The Annals of Statistics 

2007, Vol. 35, No. 5, 2018-2053 

DOI: 10.1214/0009053607000000244 

© Institute of Mathematical Statistics, 2007 

GOODNESS-OF-FIT TESTS VIA PHI-DIVERGENCES 

By Leah Jager and Jon A. Wellner
Grinnell College and University of Washington 

A unified family of goodness-of-fit tests based on $\phi$-divergences
is introduced and studied. The new family of test statistics $S_n(s)$ includes
both the supremum version of the Anderson-Darling statistic
and the test statistic of Berk and Jones [Z. Wahrsch. Verw. Gebiete
47 (1979) 47-59] as special cases ($s = 2$ and $s = 1$, resp.). We also
introduce integral versions of the new statistics.

We show that the asymptotic null distribution theory of Berk 
and Jones [Z. Wahrsch. Verw. Gebiete 47 (1979) 47-59] and Wellner 
and Koltchinskii [High Dimensional Probability III (2003) 321-332. 
Birkhäuser, Basel] for the Berk-Jones statistic applies to the whole
family of statistics $S_n(s)$ with $s \in [-1,2]$. On the side of power behavior,
we study the test statistics under fixed alternatives and give
extensions of the "Poisson boundary" phenomena noted by Berk and
Jones for their statistic. We also extend the results of Donoho and Jin
[Ann. Statist. 32 (2004) 962-994] by showing that all our new tests
for $s \in [-1,2]$ have the same "optimal detection boundary" for normal
shift mixture alternatives as Tukey's "higher-criticism" statistic
and the Berk-Jones statistic.

1. Introduction. In this paper we introduce and study a new family 
of goodness-of-fit tests which includes both the supremum version of the 
Anderson-Darling statistic (or, equivalently, Tukey's "higher criticism" statis- 
tics as discussed by Donoho and Jin [15]) and the test statistic of Berk 
and Jones [5] as special cases. The new family is based on phi-divergences 
somewhat analogously to the phi-divergence tests for multinomial families 
introduced by Cressie and Read [10], and is indexed by a real parameter
$s \in \mathbb{R}$: $s = 2$ gives the Anderson-Darling test statistic, $s = 1$ gives the
Berk-Jones test statistic, $s = 1/2$ gives a new (Hellinger-distance type) statistic,
$s = 0$ corresponds to the "reversed Berk-Jones" statistic studied by Jager
and Wellner [24] and $s = -1$ gives a "Studentized" (or empirically weighted)
version of the Anderson-Darling statistic. We introduce the corresponding 
integral versions of the new statistics (but will study them in detail else- 
where). Having a family of statistics available gives the possibility of better 
understanding of individual members of the family, as well as the ability to 
select particular members of the family that have different desirable prop- 
erties. 

In Section 2 we introduce the new test statistics. In Section 3 we briefly 
discuss the null distribution theory of the entire family of statistics, and note 
that the exact distributions can be computed for sample sizes up to
$n = 3000$ via Noé's recursion formulas (and possibly up to $n = 10^4$ via the
recursion of Khmaladze and Shinjikashvili [31]) along the lines explored for
the Berk-Jones statistic by Owen [36]. We also generalize the asymptotic
distribution theory of Jaeschke [22] and Eicker [17] for the supremum version 
of the Anderson-Darling statistic, and of Berk and Jones [5] and Wellner and 
Koltchinskii [43] for the Berk-Jones statistic, by showing that the existing
null distribution theory for s = 1 and s = 2 applies to (an appropriate version 
of) the whole family of statistics. We generalize the results of Owen [36] by 
showing that our family of test statistics provides a corresponding family of 
confidence bands. 

In Section 4 we study the behavior of the new family of test statistics under
fixed alternatives. We show that for $0 < s < 1$ and fixed alternatives the
test statistics always converge almost surely to their corresponding natural
parameters. For $1 \le s < \infty$, we provide necessary and sufficient conditions
on the alternative d.f. $F$ for convergence to the corresponding natural parameter
to hold, and show that the "Poisson boundary" phenomena noted
by Berk and Jones for their statistic continue to hold for $1 \le s < \infty$ and for
$s < 0$ by identifying the Poisson boundary distributions explicitly. We also
briefly discuss further large deviation results and connections between the 
work of Berk and Jones [5] and Groeneboom and Shorack [19]. 

In Section 5 we extend the results of Donoho and Jin [15] by showing that 
all our new tests for $s \in [-1,2]$ have the same "optimal detection boundary"
for normal shift mixture alternatives as Tukey's "higher-criticism" statistic
and the Berk-Jones statistic.

Our new family of test statistics not only provides a unifying framework 
for the study of a number of existing test statistics as special cases, but 
also gives the possibility of "designing" a new test with several different 
desirable properties. For example, the new statistic $S_n(1/2)$ satisfies both:

(a) it consistently estimates its "natural parameter" for every alternative and

(b) it has the same optimal detection boundary for the two-point normal
mixture alternatives of Donoho and Jin [15] as the existing test statistics
$S_n(2)$ and $S_n(1)$ considered by these authors.

2. The test statistics. Consider the classical goodness-of-fit problem:
suppose that $X_1, \ldots, X_n$ are i.i.d. $F$, and let
$\mathbb{F}_n(x) = n^{-1}\sum_{i=1}^n 1\{X_i \le x\}$
be the empirical distribution function of the sample. We want to test

$$H: F = F_0 \quad \text{versus} \quad K: F \ne F_0,$$

where $F_0$ is continuous. By the probability integral transformation, we can,
without loss of generality, suppose that $F_0$ is the uniform distribution on
$[0,1]$, $F_0(x) = (x \wedge 1) \vee 0$, and that all the distribution functions $F$ in the
alternative $K$ are defined on $[0,1]$. The basic idea behind our new family
of tests is simple. For fixed $x \in (0,1)$, the interval $[0,1]$ is divided into two
subintervals $[0,x]$ and $(x,1]$, and we can test the (pointwise) null hypothesis
$H_x: F(x) = x$ versus the (pointwise) alternative $K_x: F(x) \ne x$ using any of
the general phi-divergence test statistics $K_\phi(\mathbb{F}_n(x), x)$ proposed by Csiszár
[11] (see also Csiszár [12] and Ali and Silvey [2]) and studied further in a
multinomial context by Cressie and Read [10], where $\phi$ is a convex function
mapping $[0,\infty)$ to the extended reals $\mathbb{R} \cup \{\infty\}$ (cf. Liese and Vajda [32],
pages 10 and 212, and Vajda [39]). Then our proposed test statistics are of
the form

$$S_n(\phi) = \sup_x K_\phi(\mathbb{F}_n(x), x)$$

or

$$T_n(\phi) = \int_0^1 K_\phi(\mathbb{F}_n(x), x)\,dx,$$

where the supremum and/or integral over $x$ may require some restriction
depending on the choice of $\phi$.

In our particular case, we define $\phi = \phi_s$ for $s \in \mathbb{R}$ by

$$\phi_s(x) = \begin{cases} [1 - s + sx - x^s]/[s(1-s)], & s \ne 0, 1, \\ x(\log x - 1) + 1 \equiv h(x), & s = 1, \\ \log(1/x) + x - 1 \equiv \tilde{h}(x), & s = 0 \end{cases}$$

(cf. Liese and Vajda [32], page 34), so that

$$K_s(u,v) = v\phi_s(u/v) + (1-v)\phi_s((1-u)/(1-v)),$$

which for $s \ne 0, 1$ equals

$$K_s(u,v) = \frac{1}{s(1-s)}\{1 - u^s v^{1-s} - (1-u)^s (1-v)^{1-s}\}.$$

Note that this definition makes $\phi_s$ continuous in $s$ for all $x$ in $(0,1)$, and
hence, $K_s$ is continuous in $s$ for all $(u,v) \in (0,1)^2$. Also note that
$K_{\lambda+1}(p,q) = I^\lambda(\mathbf{p}:\mathbf{q})$, where $\mathbf{p} = (p, 1-p)$, $\mathbf{q} = (q, 1-q)$, and
$I^\lambda(\mathbf{p}:\mathbf{q})$ is as defined in (5.1),
Cressie and Read [10], page 456. Then our proposed test statistics $S_n(s)$ and
$T_n(s)$ for $s \in \mathbb{R}$ are defined by



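$$\tag{1} S_n(s) = \begin{cases} \sup_{0<x<1} K_s(\mathbb{F}_n(x),x), & \text{if } s \ge 1, \\ \sup_{X_{(1)} \le x < X_{(n)}} K_s(\mathbb{F}_n(x),x), & \text{if } s < 1, \end{cases}$$

and

$$\tag{2} T_n(s) = \begin{cases} \int_0^1 K_s(\mathbb{F}_n(x),x)\,dx, & \text{if } s > 0, \\ \int_{X_{(1)}}^{X_{(n)}} K_s(\mathbb{F}_n(x),x)\,dx, & \text{if } s \le 0. \end{cases}$$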

The reasons for changing the definitions of the statistics by restricting the 
supremum or integral for different values of s will be explained in Section 3; 
basically, the restrictions must be imposed for some appropriate value of s 
in order to maintain the same null distribution theory for all values of s in 
[-1,2]. 
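
As a rough computational illustration of definition (1): since $\phi_s$ is convex, $v \mapsto K_s(u,v)$ is convex with minimum 0 at $v = u$, and $\mathbb{F}_n$ is constant between consecutive order statistics, so on each such interval the supremum is approached at an endpoint. The following minimal Python sketch relies on this observation. The function names `k_s` and `s_n` are hypothetical (they are not taken from the C and R programs mentioned in Section 3.1), and the sketch assumes the data lie strictly inside $(0,1)$ with $n \ge 2$.

```python
import math

def k_s(u, v, s):
    """K_s(u, v) for 0 <= u <= 1 and 0 < v < 1 (closed form; s = 0, 1 as limits)."""
    def xlog(a, b):
        # a * log(a / b), with the convention 0 * log(0 / b) = 0
        return 0.0 if a == 0.0 else a * math.log(a / b)
    if s == 1:
        return xlog(u, v) + xlog(1.0 - u, 1.0 - v)
    if s == 0:
        return xlog(v, u) + xlog(1.0 - v, 1.0 - u)
    return (1.0 - u**s * v**(1.0 - s)
            - (1.0 - u)**s * (1.0 - v)**(1.0 - s)) / (s * (1.0 - s))

def s_n(sample, s):
    """Supremum statistic S_n(s) of definition (1) for data in (0, 1)."""
    x = sorted(sample)
    n = len(x)
    # s >= 1: sup over 0 < x < 1, so F_n takes the values 0/n, ..., n/n.
    # s <  1: sup over X_(1) <= x < X_(n), so F_n takes the values 1/n, ..., (n-1)/n.
    first, last = (0, n) if s >= 1 else (1, n - 1)
    best = 0.0
    for i in range(first, last + 1):
        u = i / n                      # F_n = i/n on [X_(i), X_(i+1))
        if i >= 1:                     # left endpoint x = X_(i)
            best = max(best, k_s(u, x[i - 1], s))
        if i <= n - 1:                 # limit as x increases to X_(i+1)
            best = max(best, k_s(u, x[i], s))
    return best
```

For example, `s_n(data, 1)` would give the Berk-Jones statistic and `s_n(data, 0.5)` the Hellinger-type member of the family; multiplying by $n$ gives the scale used in the limit theory of Section 3.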

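The most notable special cases of these statistics are $s \in \{-1, 0, 1/2, 1, 2\}$:
it is easily checked that

$$\begin{aligned}
K_2(u,v) &= \frac{1}{2}\frac{(u-v)^2}{v(1-v)}, \\
K_1(u,v) &= u\log\frac{u}{v} + (1-u)\log\frac{1-u}{1-v}, \\
K_{1/2}(u,v) &= 4\{1 - \sqrt{uv} - \sqrt{(1-u)(1-v)}\}, \\
K_0(u,v) &= K_1(v,u) = v\log\frac{v}{u} + (1-v)\log\frac{1-v}{1-u}, \\
K_{-1}(u,v) &= K_2(v,u) = \frac{1}{2}\frac{(u-v)^2}{u(1-u)}.
\end{aligned}$$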

It follows that:

(a) $S_n(2)$ is (1/2 times) the square of the supremum form of the Anderson-Darling
statistic (or, in its one-sided form, Tukey's "higher criticism"
statistic; see Donoho and Jin [15] and Section 5).

(b) $S_n(1)$ is the statistic studied by Berk and Jones [5].

(c) $S_n(1/2)$ is (4 times) the supremum of the pointwise Hellinger divergences
between Bernoulli($\mathbb{F}_n(x)$) and Bernoulli($F_0(x)$); as far as we know, this
is a new goodness-of-fit statistic [as are all the statistics $S_n(s)$ for
$s \notin \{-1,0,1,2\}$].

(d) $S_n(0)$ is the "reversed Berk-Jones" statistic introduced by Jager and
Wellner [24].

(e) $S_n(-1)$ is (1/2 times) a "Studentized" version of the supremum form of
the Anderson-Darling statistic; see, for example, Eicker [17], page 116.

(f) $T_n(1)$ is the integral form of the Berk-Jones statistic introduced by
Einmahl and McKeague [18].

(g) $T_n(2)$ is the classical (integral form of the) Anderson-Darling statistic
introduced by Anderson and Darling [3].

Remark 2.1. Note that $K_{1/2-r}(u,v) = K_{1/2+r}(v,u)$ for $r \in \mathbb{R}$ and
$u, v \in (0,1)$, so the families of statistics $S_n(s)$ and $T_n(s)$ have a natural
symmetry about $s = 1/2$. We will continue to use the "s-parametrization" of
these families for reasons of notational simplicity.
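
The special cases and the symmetry of Remark 2.1 can be checked numerically. The sketch below builds $K_s$ directly from the definition $K_s(u,v) = v\phi_s(u/v) + (1-v)\phi_s((1-u)/(1-v))$ and compares it with the five closed forms for $s \in \{-1, 0, 1/2, 1, 2\}$. The names `phi`, `K` and `closed_forms` are hypothetical helpers introduced only for this illustration.

```python
import math

def phi(x, s):
    """phi_s(x) for x > 0; the cases s = 1 and s = 0 are the limiting forms."""
    if s == 1:
        return x * (math.log(x) - 1.0) + 1.0
    if s == 0:
        return math.log(1.0 / x) + x - 1.0
    return (1.0 - s + s * x - x**s) / (s * (1.0 - s))

def K(u, v, s):
    """K_s(u, v) = v*phi_s(u/v) + (1-v)*phi_s((1-u)/(1-v)) for u, v in (0, 1)."""
    return v * phi(u / v, s) + (1.0 - v) * phi((1.0 - u) / (1.0 - v), s)

def closed_forms(u, v):
    """Closed forms of K_s(u, v) for s in {2, 1, 1/2, 0, -1}, keyed by s."""
    return {
        2: 0.5 * (u - v) ** 2 / (v * (1.0 - v)),
        1: u * math.log(u / v) + (1.0 - u) * math.log((1.0 - u) / (1.0 - v)),
        0.5: 4.0 * (1.0 - math.sqrt(u * v) - math.sqrt((1.0 - u) * (1.0 - v))),
        0: v * math.log(v / u) + (1.0 - v) * math.log((1.0 - v) / (1.0 - u)),
        -1: 0.5 * (u - v) ** 2 / (u * (1.0 - u)),
    }

u, v = 0.3, 0.45
for s, value in closed_forms(u, v).items():
    print(s, K(u, v, s), value)                     # the two columns should agree
for r in (0.25, 0.5, 1.5):
    print(r, K(u, v, 0.5 - r), K(v, u, 0.5 + r))    # symmetry about s = 1/2
```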

3. Distributions under the null hypothesis. 

3.1. Finite sample critical points via Noé's recursion. Owen [36] showed
how to use the recursions of Noé [35] to obtain finite sample critical points of
the Berk-Jones statistic $R_n = S_n(1)$ for values of $n$ up to 1000. (See Shorack
and Wellner [38], pages 362-366 for an exposition of Noé's methods.) Jager
and Wellner [24] pointed out a minor error in the derivations of Owen [36]
and extended his results to the reversed Berk-Jones statistic $S_n(0)$. Jager
[23] gives exact finite sample computations for the whole family of statistics 
via Noe's recursions for values of n up to 3000. (The C and R programs 
are available at the second author's website.) We will not give details of the 
finite-sample computations here but refer the interested reader to Jager and 
Wellner [24] and Jager [23]. See Jager and Wellner [24] and Jager [23] for 
plots of finite sample critical points and several finite sample approximations 
based on the asymptotic theory given here. 

During the revision of this paper we learned of an alternative finite-sample
recursion for calculation of the null distribution of $S_n(2)$ proposed by Khmaladze
and Shinjikashvili [31] which apparently works for $n \le 10^4$. Presumably
this alternative recursion could be used for our entire family of statistics,
but this has not yet been carried out.

3.2. Asymptotic distribution theory for $S_n(s)$ under the null hypothesis.
Limit distribution theory for $S_n(2)$ and $S_n(-1)$ under the null hypothesis
follows from the work of Jaeschke [22] and Eicker [17]; see Shorack and
Wellner [38], Chapter 16, pages 597-615 for an exposition. These results are
closely related to the classical results of Darling and Erdős [14]. Berk and
Jones [5] stated the asymptotic distribution of their statistic $R_n = S_n(1)$.
For details of the proof, see Wellner and Koltchinskii [43], with a minor
correction as noted at the end of the proof here. Here we show that the limit
distribution of $nS_n(s) - r_n$ is the same double-exponential extreme value
distribution for all $-1 \le s \le 2$, where

$$r_n = \log_2 n + \tfrac{1}{2}\log_3 n - \tfrac{1}{2}\log(4\pi),$$

with $\log_2 n = \log(\log n)$ and $\log_3 n = \log(\log_2 n)$.

Theorem 3.1 (Limit distribution under null hypothesis). Suppose that
the null hypothesis $H$ holds so that $F$ is the uniform distribution on $[0,1]$.
Then for $-1 \le s \le 2$ it follows that

$$nS_n(s) - r_n \to_d Y_4 \sim E_v^4,$$

where $E_v^4(x) = \exp(-4\exp(-x)) = P(Y_4 \le x)$.

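Define

$$b_n = \sqrt{2\log_2 n}, \qquad c_n = 2\log_2 n + \tfrac{1}{2}\log_3 n - \tfrac{1}{2}\log(4\pi),$$

$$d_n = n^{-1}(\log n)^5, \qquad Z_n = \sup_{d_n \le x \le 1-d_n} \frac{\sqrt{n}\,|\mathbb{F}_n(x) - x|}{\sqrt{x(1-x)}}.$$

As will be seen, the proof involves the following four facts: Fact 1. $Z_n/b_n \to_p 1$.
Fact 2. $b_n Z_n - c_n \to_d Y_4 \sim E_v^4$. Fact 3. $(1/2)c_n^2/b_n^2 = r_n + o(1)$. Fact 4.
$nS_n(s) = (1/2)Z_n^2 + o_p(1)$.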

In the ranges $s > 2$ and $s < -1$ we do not know a theorem describing the
behavior of the statistics $S_n(s)$ under the null hypothesis.
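
To make the use of Theorem 3.1 concrete, the limit law $\exp(-4\exp(-x))$ can be inverted in closed form to give a rough large-sample critical value for $S_n(s)$. The sketch below is only an illustration under the assumption that the asymptotic approximation is applied directly; convergence to extreme-value limits of this type is typically slow, so the exact finite-sample computations of Section 3.1 are preferable when available. All function names are hypothetical.

```python
import math

def r_n(n):
    """Centering constant r_n = log2 n + (1/2) log3 n - (1/2) log(4*pi), n >= 3."""
    log2n = math.log(math.log(n))      # log_2 n = log(log n)
    log3n = math.log(log2n)            # log_3 n = log(log_2 n)
    return log2n + 0.5 * log3n - 0.5 * math.log(4.0 * math.pi)

def limit_cdf(x):
    """Limiting d.f. of n*S_n(s) - r_n in Theorem 3.1: exp(-4 exp(-x))."""
    return math.exp(-4.0 * math.exp(-x))

def limit_quantile(p):
    """Solve exp(-4 exp(-x)) = p for x, where 0 < p < 1."""
    return -math.log(-math.log(p) / 4.0)

def approx_critical_value(n, alpha):
    """Asymptotic approximation to the level-alpha critical value of S_n(s)."""
    return (r_n(n) + limit_quantile(1.0 - alpha)) / n
```

A test of approximate level $\alpha$ would then reject when the computed value of $S_n(s)$ exceeds `approx_critical_value(n, alpha)`, for any $s \in [-1,2]$.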

3.3. Confidence bands. Owen [36] showed how the Berk-Jones statistic
$R_n = S_n(1)$ can be inverted to obtain confidence bands for an unknown
distribution function $F$. Similarly, the family of statistics $S_n(s)$ yields a new
family of confidence bands for $F$ as follows: given a continuous d.f. $F$ on $\mathbb{R}$,
define

$$S_n(s,F) = \begin{cases} \sup_{-\infty<x<\infty} K_s(\mathbb{F}_n(x),F(x)), & \text{if } s \ge 1, \\ \sup_{X_{(1)} \le x < X_{(n)}} K_s(\mathbb{F}_n(x),F(x)), & \text{if } s < 1. \end{cases}$$

By the (inverse) probability integral transformation, $P_F(S_n(s,F) \le t) =
P_{F_0}(S_n(s) \le t)$ for all $t$, where $S_n(s)$ is as defined in (1) and $F_0$ is the uniform
distribution on $[0,1]$. Hence, with $q_n(s,\alpha)$ denoting the upper $1-\alpha$ quantile
of the distribution of $S_n(s)$ under $F_0$ (which is computable via Noé's
recursion as discussed in Section 3.1 or can be approximated for large $n$ via
Theorem 3.1), it follows that

$$P_F(S_n(s,F) \le q_n(s,\alpha)) = P_{F_0}(S_n(s) \le q_n(s,\alpha)) = 1 - \alpha$$

for each fixed $\alpha \in (0,1)$ and $n$. Hence,

$$\{F: S_n(s,F) \le q_n(s,\alpha)\} = \{F: L_n(x;s,\alpha) \le F(x) \le U_n(x;s,\alpha) \text{ for all } x \in \mathbb{R}\}$$

yields a family of $1-\alpha$ confidence bands for $F$. Here $L_n(x;s,\alpha)$ and $U_n(x;s,\alpha)$
are random functions determined by $s$, $\alpha$, $n$ and the data in a straightforward
way; see Owen [36], Jager and Wellner [24] and Jager [23] for details.
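
One way to picture how the band limits arise: at a point $x$ with $u = \mathbb{F}_n(x) \in (0,1)$, the set of values $v = F(x)$ satisfying $K_s(u,v) \le q$ is an interval around $u$, because $v \mapsto K_s(u,v)$ is convex and vanishes at $v = u$. The following sketch finds that interval by bisection. It is not the procedure of Owen [36] or Jager [23]; the names `k_s` and `band_at`, the tolerance, and the treatment of the endpoints 0 and 1 are assumptions made for illustration, and the threshold `q` stands for a critical value such as $q_n(s,\alpha)$ obtained elsewhere.

```python
import math

def k_s(u, v, s):
    """K_s(u, v) for u, v strictly inside (0, 1)."""
    if s == 1:
        return u * math.log(u / v) + (1 - u) * math.log((1 - u) / (1 - v))
    if s == 0:
        return k_s(v, u, 1)            # K_0(u, v) = K_1(v, u)
    return (1 - u**s * v**(1 - s) - (1 - u)**s * (1 - v)**(1 - s)) / (s * (1 - s))

def band_at(u, q, s, tol=1e-12):
    """Return (L, U), the set of v in [0, 1] with K_s(u, v) <= q, for 0 < u < 1."""
    def solve(inner, outer):
        # K_s(u, .) is 0 at v = u and monotone between u and either boundary,
        # so bisect between `inner` (K <= q) and `outer` (K > q).
        while abs(outer - inner) > tol:
            mid = 0.5 * (inner + outer)
            if k_s(u, mid, s) <= q:
                inner = mid
            else:
                outer = mid
        return inner
    edge = 1e-15                       # stand-in for the open endpoints 0 and 1
    lower = 0.0 if k_s(u, edge, s) <= q else solve(u, edge)
    upper = 1.0 if k_s(u, 1.0 - edge, s) <= q else solve(u, 1.0 - edge)
    return lower, upper
```

Evaluating `band_at(i / n, q, s)` for $i = 1, \ldots, n-1$ gives candidate limits at the levels taken by $\mathbb{F}_n$ between the extreme order statistics; for $s < 1$ the statistic places no constraint outside $[X_{(1)}, X_{(n)})$, which this sketch does not address.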

3.4. Asymptotic distribution theory for $T_n(s)$ under the null hypothesis.
Limit distribution theory for $T_n(2)$ was established by Anderson and Darling
[3]. Einmahl and McKeague [18] noted that this carries over to $T_n(1)$ (for
a proof, see Wellner and Koltchinskii [43]) and extended the statistic to other
testing problems. Here we show that the limit distribution of $nT_n(s)$ is (1/2
times) the Anderson-Darling limit distribution for all $s \in [-1,2]$, namely,
the distribution of

$$\tag{3} \frac{1}{2}A^2 \equiv \frac{1}{2}\int_0^1 \frac{\mathbb{U}^2(t)}{t(1-t)}\,dt \stackrel{d}{=} \frac{1}{2}\sum_{j=1}^\infty \frac{Z_j^2}{j(j+1)},$$

where $\mathbb{U}$ is a standard Brownian bridge process on $[0,1]$ and $Z_1, Z_2, \ldots$ are
i.i.d. $N(0,1)$; see, for example, Shorack and Wellner [38], pages 224-227.

Theorem 3.2 (Limit distribution of $T_n(s)$ under the null hypothesis).
Suppose that the null hypothesis $H$ holds so that $F$ is the uniform distribution
on $[0,1]$. Then for $-\infty < s \le 2$ it follows that $nT_n(s) \to_d A^2/2$, where $A^2$ is
the Anderson-Darling limit defined in (3).

We will not study the statistics $T_n(s)$ further in this paper, but intend to
continue their study elsewhere.

4. Limit theory under alternatives. The power behavior of individual
members of our new family of statistics has previously been studied separately
and somewhat in isolation: see, for example, Berk and Jones [5] (for
the Berk-Jones statistic), Durbin, Knott and Taylor [16] and D'Agostino and
Stephens [13] for $T_n(2)$ compared to other integral goodness-of-fit statistics
and Nikitin [34] for treatment of Bahadur efficiencies for many goodness-of-fit
statistics. Interest in these test statistics has received new impetus via
the use of appropriate one-sided versions of the test statistic $S_n(2)$ in the
context of multiple testing problems; see, for example, Donoho and Jin [15],
Jin [28] and Meinshausen and Rice [33]. See Cayón, Jin and Treaster [8] for
an interesting application to detection of non-Gaussianity in the cosmic
microwave background data gathered by the Wilkinson Microwave Anisotropy
Probe (WMAP) satellite, and see Cai, Jin and Low [7] for further work on
estimation aspects of the problem in connection with the developments in
Meinshausen and Rice [33]. The work of Owen [36] on confidence bands derived
from the Berk-Jones statistic $S_n(1)$ was apparently motivated in large
part by the Bahadur efficiency results of Berk and Jones [4].

It is clear from the results of Donoho and Jin [15] and earlier efforts by
Révész [37] to combine the strengths of the Kolmogorov and Jaeschke-Eicker
statistics that tests based on any of the statistics $S_n(s)$ will do best against
"alternatives in the tails." As suggested by one of the referees of our paper,
this may well be the "Achilles heel" of such test statistics, since the results
of Révész [37] suggest that our statistics will have no asymptotic power for
a large class of "contiguous alternatives" (with departures from the null
hypothesis "in the middle" of the distribution). On the other hand, having
a family of statistics such as $\{S_n(s): s \in \mathbb{R}\}$ available gives the possibility of
choosing (or "designing") a test with several desirable properties. We will
return to this briefly in Section 6.

Here we study convergence of the family of statistics to their "natural
parameters" under fixed alternatives, comment briefly on the Bahadur
efficiency results of Berk and Jones [5] in light of the results of Groeneboom
and Shorack [19], and show that the optimal detection boundary results
of Donoho and Jin [15] extend to the whole family of statistics $S_n(s)$ for
$s \in [-1,2]$. In spite of the negative results of Janssen [27] for goodness-of-fit
statistics in general, much remains to be learned about the power behavior
of the family $\{S_n(s)\}$.

4.1. Almost sure convergence to natural parameter. Let $F_0$ be the Uniform(0,1)
distribution function as in Section 2. The Kolmogorov statistic
$D_n \equiv \|\mathbb{F}_n - F_0\|_\infty$ has the property that for any distribution function $F$,
if $X_1, \ldots, X_n$ are i.i.d. $F$, then

$$D_n \to_{a.s.} \|F - F_0\|_\infty \equiv d(F).$$

We call $d(F) = \|F - F_0\|_\infty$ the natural parameter for the Kolmogorov statistic
$D_n$. As Berk and Jones [5] pointed out for their statistic $R_n = S_n(1)$,
under alternatives $F$ the convergence

$$S_n(1) = R_n = \sup_{0<x<1} K_1(\mathbb{F}_n(x),x) \to_{a.s.} \sup_{0<x<1} K_1(F(x),x) \equiv r(F)$$

holds only under some condition on $F$ (the exact condition will be given
below), and for a slightly more extreme $F$, namely, what we call the "Poisson
boundary distribution function," the behavior changes to convergence in
distribution to a functional of a Poisson process rather than convergence
to a natural parameter. Thus, Berk and Jones [5] showed that if $F(x) =
1/(1 + \log(1/x))$, then

$$\tag{4} S_n(1) = R_n \to_d \sup_{t>0} \frac{\mathbb{N}(t)}{t} \stackrel{d}{=} \frac{1}{U},$$

where $\mathbb{N}$ is a standard Poisson process and $U \sim$ Uniform(0,1).

It turns out that in the range $0 < s < 1$ the statistics $S_n(s)$ behave
analogously to the Kolmogorov statistic $D_n$ under fixed alternatives. Namely,
we show that in this range the statistics converge almost surely to their
"natural parameter" for all d.f.'s $F$.

Proposition 4.1. Suppose that $X_1, \ldots, X_n$ are i.i.d. $F$ and that
$0 < s < 1$. Then $S_n(s) \to_{a.s.} \sup_{0<x<1} K_s(F(x),x) \equiv S_\infty(s,F)$.

On the other hand, in the range $s > 1$ we have the following criterion for
almost sure convergence of the statistics $S_n(s)$ to their natural parameters:

Proposition 4.2. Suppose that $X_1, \ldots, X_n$ are i.i.d. $F$ and that $s > 1$.
Then $S_n(s) \to_{a.s.} \sup_{0<x<1} K_s(F(x),x) \equiv S_\infty(s,F)$ if and only if $F$ satisfies

$$\int_0^1 \frac{1}{(F^{-1}(u)(1 - F^{-1}(u)))^{(s-1)/s}}\,du < \infty.$$

By the (inverse) probability integral transformation, convergence of the integral
in the last display is equivalent to $E_F[X(1-X)]^{-(s-1)/s} < \infty$.

As Berk and Jones [5] show, if for some $\gamma > 0$, the distribution function
$F$ satisfies

$$F(x) \le \{\log(1/x)(\log_2(1/x))^{1+\gamma}\}^{-1}, \qquad x \le \gamma,$$

and

$$1 - F(x) \le \{\log(1/(1-x))(\log_2(1/(1-x)))^{1+\gamma}\}^{-1}, \qquad x \ge 1 - \gamma,$$

then $R_n = S_n(1) \to_{a.s.} \sup_{0<x<1} K_1(F(x),x) = S_\infty(1,F) = r(F) < \infty$. It can
be shown that this convergence holds if and only if
$\int_0^1 [x(1-x)]^{-1} F(x)(1 - F(x))\,dx < \infty$; see Jager [23] for details. We do not yet know sharp
conditions for $S_n(s) \to_{a.s.} S_\infty(s,F)$ when $s < 0$.

4.2. Poisson boundaries for $s \ge 1$ and $s < 0$. As noted in the previous
subsection, the statistic $R_n = S_n(1)$ has a "Poisson boundary" d.f. $F_1$ for
which $R_n = S_n(1) \to_d 1/U$ rather than $R_n = S_n(1) \to_{a.s.} r(F) = S_\infty(1,F)$. Here
we note that this behavior persists for the entire range $s \ge 1$ and for $s < 0$.

For each fixed $s \in [0,1)^c$, define the distribution function $F_s$ on $[0,1]$ by
$F_s(0) = 0$ and, for $0 < x \le 1$, by

$$\tag{5} F_s(x) = \begin{cases} (1 + (x^{1-s} - 1)/(s-1))^{-1/s}, & 1 < s < \infty, \\ (1 + \log(1/x))^{-1}, & s = 1, \\ \{1 - s(x^{s-1} - 1)\}^{1/s}, & s < 0. \end{cases}$$

Note that $F_s(x) \to F_1(x)$ as $s \searrow 1$ for $0 < x \le 1$.

The following proposition includes the result (4) of Berk and Jones [5]
when $s = 1$, and it agrees with the case $b = 1/2$ of Theorem 2 of Jager and
Wellner [25] when $s = 2$. As in (4), let $\mathbb{N}$ be a standard Poisson process, and
let $U \sim$ Uniform(0,1).

Proposition 4.3 (Poisson boundaries for $s \ge 1$ and $s < 0$).
(i) Fix $s \ge 1$ and suppose that $X_1, \ldots, X_n$ are i.i.d. $F_s$ given in (5). Then

$$S_n(s) \to_d \frac{1}{s}\Big(\sup_{t>0} \frac{\mathbb{N}(t)}{t}\Big)^s \stackrel{d}{=} \frac{1}{s}\,\frac{1}{U^s}.$$

(ii) Fix $s < 0$ and suppose that $X_1, \ldots, X_n$ are i.i.d. $F_s$ given in (5). Then



where $S_1 = E_1$ is the first jump point of $\mathbb{N}$.

Remark 4.1. The distribution of $\sup_{t \ge S_1}(t/\mathbb{N}(t))$, which is also the
limiting distribution of $\sup\{(t/\mathbb{G}_n(t)): t \ge \xi_{(1)}\}$ where $\mathbb{G}_n$ is the empirical
distribution function of $n$ i.i.d. Uniform(0,1) random variables $\xi_1, \ldots, \xi_n$, is
given by

P{ sup{t/N(t)) > X I = exp(-x) + -f x'^expf-Zcx), x > 1; 

see Wellner [40], pages 1008-1009 and Shorack and Wellner [38], page 412. 
This yields an explicit formula for the distribution of the random variable 
on the right-hand side of (6). 

Remark 4.2. Although the family of distributions $F_s$ satisfies $F_s(x) \to
\exp(-(1/x - 1)) \equiv F_0(x)$ as $s \nearrow 0$, it appears that the natural limit in
distribution under $F_0$ is $S_n(0) \to_d 1 = \sup_{0<x<1} K_0(F_0(x),x)$ in this case, so
apparently convergence to the natural parameter continues to hold under $F_0$.

We do not know if there is a (more extreme) d.f. $F_0$ for which $S_n(0) \to_d g(\mathbb{N})$
for a nondegenerate functional $g$ of a standard Poisson process $\mathbb{N}$.

Remark 4.3. If $F \in K$ has Poisson boundary behavior at both 0 and
1, then natural generalizations of Proposition 4.3 involving two independent
Poisson processes can easily be proved. For example, if $F$ is the standard
arcsin law with density $\pi^{-1}u^{-1/2}(1-u)^{-1/2}1_{(0,1)}(u)$, then
$S_n(2) \to_d (2/\pi^2)\max\{(\sup_{t>0}(\mathbb{N}(t)/t))^2, (\sup_{t>0}(\tilde{\mathbb{N}}(t)/t))^2\}$,
where $\mathbb{N}$, $\tilde{\mathbb{N}}$ are independent standard Poisson processes.

4.3. Bahadur efficiency comparisons. Berk and Jones [5] studied the
Bahadur efficiency of their statistic $S_n(1) = R_n$ relative to weighted
Kolmogorov statistics based on the work of Abrahamson [1]. As pointed out
by Groeneboom and Shorack [19], however, the Bahadur efficacies of the
weighted Kolmogorov statistics are 0 for weights heavier than the (quite
light) logarithmic weight function $-\log(x(1-x))$ because the null
distribution large-deviation result is degenerate for heavier weights. A second
difficulty for Bahadur efficiency comparisons is that both the weighted
Kolmogorov statistics and the Berk-Jones statistic fail to converge almost surely
to their natural parameters for sufficiently extreme alternative d.f.'s $F$, and
as noted by Berk and Jones [5] for the Berk-Jones statistic and by Jager
and Wellner [25] for the weighted Kolmogorov statistics, there is a certain
"Poisson boundary" d.f. $F$ for which the statistics converge in distribution
to a functional of a Poisson process. Thus, comparisons of goodness-of-fit
statistics of the supremum type via Bahadur efficiency are rendered difficult
both by breakdowns in the large-deviation theory under the null hypothesis
and by failure of the statistics to converge almost surely under fixed
alternatives. Nevertheless, it would be interesting to be able to make comparisons
where possible.

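To this end, we consider variants of our statistics in the range $0 < s < 1$
with the supremum unrestricted as follows:

$$S_n^{ur}(s) = \sup_{0<x<1} K_s(\mathbb{F}_n(x),x),$$

$$S_n^{ur,+}(s) = \sup_{0<x<1} K_s^+(\mathbb{F}_n(x),x), \qquad S_n^{ur,-}(s) = \sup_{0<x<1} K_s^-(\mathbb{F}_n(x),x),$$

where

$$K_s^+(u,v) = \begin{cases} K_s(u,v), & \text{if } 0 < v < u < 1, \\ 0, & \text{if } 0 < u \le v < 1, \\ \infty, & \text{otherwise}, \end{cases}$$

$$K_s^-(u,v) = \begin{cases} K_s(u,v), & \text{if } 0 < u < v < 1, \\ 0, & \text{if } 0 < v \le u < 1, \\ \infty, & \text{otherwise}. \end{cases}$$

Also, set $K(u,v) = K_1(u,v) = u\log(u/v) + (1-u)\log((1-u)/(1-v))$ for
$(u,v) \in (0,1)^2$ and $K^{\pm}(u,v) = K_1^{\pm}(u,v)$. Although we do not yet have large
deviation results for the statistics $S_n(s)$ or
$S_n^+(s) = \sup_{X_{(1)} \le x < X_{(n)}} K_s^+(\mathbb{F}_n(x),x)$,
we can establish the following large deviation results for $S_n^{ur,+}(s)$ and $S_n^{ur,-}(s)$.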

Theorem 4.4. Suppose that $X_1, \ldots, X_n$ are i.i.d. with continuous d.f.
$F_0$, the uniform distribution on $(0,1)$. Fix $s \in (0,1)$. Then

$$\tag{7} n^{-1}\log P_0(S_n^{ur,+}(s) \ge a) \to -\inf_{0<x<1} K^+(t_s^+(x,a),x) = \frac{\log[1 - s(1-s)a]}{1-s} \equiv -g_s^+(a)$$

for each $0 < a < 1/[s(1-s)]$, where $t_s^+(x,a) = \inf\{t: K_s^+(t,x) \ge a\}$. Furthermore,

$$n^{-1}\log P_0(S_n^{ur,-}(s) \ge a) \to -\inf_{0<x<1} K^-(t_s^-(x,a),x) = \frac{\log[1 - s(1-s)a]}{1-s} \equiv -g_s^-(a)$$

for each $a > 0$, where $t_s^-(x,a) = \sup\{t: K_s^-(t,x) \ge a\}$.



Combining Theorem 4.4 with Proposition 4.1, we have the following corollary
for the Bahadur efficacies of the statistics $S_n^{ur,+}(s)$ and $S_n^{ur,-}(s)$ with
$0 < s < 1$.



Corollary 4.5. Let $F$ be a continuous distribution function on $[0,1]$.
Then the Bahadur efficacy of $S_n^{ur,+}(s)$ at the alternative $F$ is

$$e_s^+(F) = g_s^+(S_\infty^{ur,+}(s,F)) = g_s^+(S_\infty^+(s,F)),$$

where $g_s^+$ is defined in (7) and $S_\infty^+(s,F) = \sup_{0<x<1} K_s^+(F(x),x)$.

Remark. Note that $\lim_{s \nearrow 1} g_s^+(a) = a$, in agreement with Theorem 2.2,
page 50, of Berk and Jones [5].

Remark. Since $g_s^+(a) = g_s^-(a) \sim sa$ as $s \searrow 0$, the Bahadur efficacies
of the statistics $S_n^{ur,\pm}(s)$ tend to be smaller than the efficacies of the
Berk-Jones statistic $S_\infty(1,F) = r(F)$ (when the latter exists), and especially so for
small $s$. This, together with extensive numerical computations of Jager [23],
strengthens the case in favor of the statistics
$S_n^+(s) = \sup_{X_{(1)} \le x < X_{(n)}} K_s^+(\mathbb{F}_n(x),x)$ with restricted supremum. Unfortunately we do not yet know the
large deviation behavior of these statistics with restricted supremum.
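
The two limits quoted in the remarks above are easy to check numerically. The sketch below assumes that the rate function in (7) has the form $g_s^{\pm}(a) = -\log[1 - s(1-s)a]/(1-s)$ for $0 < s < 1$ and $0 < a < 1/[s(1-s)]$; the function name `g` is a hypothetical helper.

```python
import math

def g(a, s):
    """Assumed rate function g_s(a) = -log(1 - s(1-s)a) / (1-s) for 0 < s < 1."""
    if not 0.0 < s < 1.0:
        raise ValueError("s must lie in (0, 1)")
    if not 0.0 < a < 1.0 / (s * (1.0 - s)):
        raise ValueError("a must lie in (0, 1/(s(1-s)))")
    # log1p keeps precision when s(1-s)a is tiny
    return -math.log1p(-s * (1.0 - s) * a) / (1.0 - s)

a = 0.3
for s in (0.999, 0.5, 0.1, 0.001):
    # g(a, s) should be close to a for s near 1 and close to s*a for s near 0
    print(s, g(a, s), s * a)
```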

5. Attainment of the Ingster-Donoho-Jin optimal detection boundary.

Jin [28] and Donoho and Jin [15] consider testing in a "sparse heterogeneous
mixture" problem defined as follows: Suppose that $Y_1, \ldots, Y_n$ are i.i.d. $G$ on
$\mathbb{R}$ and consider testing

$$H_0: G = \Phi, \text{ the standard } N(0,1) \text{ distribution function}$$

versus

$$H_1: G = (1-\varepsilon)\Phi + \varepsilon\Phi(\cdot - \mu) \quad \text{for some } \varepsilon \in (0,1),\ \mu > 0.$$

In particular, they consider the $n$-dependent alternatives given by

$$\tag{8} H_1^{(n)}: G_n = (1-\varepsilon_n)\Phi + \varepsilon_n\Phi(\cdot - \mu_n) \quad \text{for } \varepsilon_n = n^{-\beta},\ \mu_n = \sqrt{2r\log n},$$

where $1/2 < \beta < 1$ and $0 < r < 1$. By transforming to $X_i = 1 - \Phi(Y_i) \sim$ i.i.d.
$F = 1 - G(\Phi^{-1}(1 - \cdot))$ (with the $X_i$'s taking values in $[0,1]$), the testing
problem becomes: test

$$H_0: F = F_0, \text{ the Uniform}(0,1) \text{ distribution function}$$

versus

$$H_1: F(u) = F_0(u) + \varepsilon\{(1-u) - \Phi(\Phi^{-1}(1-u) - \mu)\} > F_0(u).$$

[The corresponding $n$-dependent sequence is
$F_n(u) = u + \varepsilon_n\{(1-u) - \Phi(\Phi^{-1}(1-u) - \mu_n)\}$ with the same choice of $\varepsilon_n$ and $\mu_n$ as in (8).] Donoho and Jin
[15] consider several different test statistics, among which the principal
contenders are Tukey's "higher criticism" statistic $HC_n^*$ defined by

$$HC_n^* = \sup_{X_{(1)} \le x \le X_{([\alpha_0 n])}} \frac{\sqrt{n}(\mathbb{F}_n(x) - x)}{\sqrt{x(1-x)}}$$

for some $\alpha_0 > 0$ (they seem to usually take $\alpha_0 = 1/2$), and a one-sided version
of the Berk-Jones statistic $BJ_n^+ = n\sup_{X_{(1)} \le x \le 1/2} K_1^+(\mathbb{F}_n(x),x)$, where
$K_s^+(u,v) = K_s(u,v)1\{0 < v < u < 1\}$.
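
The transformation to the uniform scale can be sketched in a few lines of Python using the standard library's `statistics.NormalDist` for $\Phi$ and $\Phi^{-1}$. The sketch assumes the calibration $\varepsilon_n = n^{-\beta}$, $\mu_n = \sqrt{2r\log n}$ of (8); the function names `f_alt`, `calibrate` and `sample_x` are hypothetical and introduced only for illustration.

```python
import math
import random
from statistics import NormalDist

PHI = NormalDist()          # standard normal: PHI.cdf and PHI.inv_cdf

def f_alt(u, eps, mu):
    """Alternative d.f. F(u) = u + eps*{(1-u) - Phi(Phi^{-1}(1-u) - mu)}, 0 < u < 1."""
    return u + eps * ((1.0 - u) - PHI.cdf(PHI.inv_cdf(1.0 - u) - mu))

def calibrate(n, beta, r):
    """Return (eps_n, mu_n) = (n^{-beta}, sqrt(2 r log n)) as in (8)."""
    return n ** (-beta), math.sqrt(2.0 * r * math.log(n))

def sample_x(n, beta, r, rng):
    """Draw Y_i from the mixture G_n of (8) and return X_i = 1 - Phi(Y_i)."""
    eps, mu = calibrate(n, beta, r)
    xs = []
    for _ in range(n):
        shift = mu if rng.random() < eps else 0.0   # shifted component w.p. eps_n
        y = rng.gauss(0.0, 1.0) + shift
        xs.append(1.0 - PHI.cdf(y))
    return xs
```

Under the alternative the transformed observations have an excess of values near 0, which is why the one-sided statistics above restrict attention to $\mathbb{F}_n(x) > x$ on the lower half of the unit interval.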

Jin [28] (see also Ingster [20, 21]) showed that the likelihood ratio test of
$H_0$ versus $H_1^{(n)}$ has a "detection boundary" defined in terms of the parameters
$\beta \in (1/2,1)$ and $r \in (0,1)$ involved in (8) which is described as follows:
set

$$\rho^*(\beta) = \begin{cases} \beta - 1/2, & 1/2 < \beta \le 3/4, \\ (1 - \sqrt{1-\beta})^2, & 3/4 < \beta < 1. \end{cases}$$

Then for $r > \rho^*(\beta)$, the likelihood ratio test (which makes use of knowledge
of $\beta$ and $r$) is size and power consistent against $H_1^{(n)}$ as $n \to \infty$. Donoho
and Jin [15] show that the tests of $H_0$ versus $H_1^{(n)}$ based on $HC_n^*$ and $BJ_n^+$
are also size and power consistent as $n \to \infty$ and that both of these tests
dominate several other tests based on multiple comparison procedures such
as the sample range, sample maximum, FDR (False Discovery Rate) and
Fisher's method; see, for example, Figure 1 of Donoho and Jin [15] and
their Theorems 1.4 and 1.5.
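
For quick reference, the detection boundary can be coded directly. The sketch assumes the piecewise form $\rho^*(\beta) = \beta - 1/2$ for $1/2 < \beta \le 3/4$ and $\rho^*(\beta) = (1 - \sqrt{1-\beta})^2$ for $3/4 < \beta < 1$; the names `rho_star` and `above_boundary` are hypothetical.

```python
import math

def rho_star(beta):
    """Detection boundary rho*(beta) for 1/2 < beta < 1 (assumed piecewise form)."""
    if not 0.5 < beta < 1.0:
        raise ValueError("beta must lie in (1/2, 1)")
    if beta <= 0.75:
        return beta - 0.5
    return (1.0 - math.sqrt(1.0 - beta)) ** 2

def above_boundary(beta, r):
    """True when (beta, r) lies in the region r > rho*(beta) where detection is possible."""
    return r > rho_star(beta)
```

The two branches agree at $\beta = 3/4$, where both give $1/4$, so the boundary is continuous.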

We show here that the tests based on appropriate one-sided versions of
the statistics $S_n(s)$, namely,

$$nS_n^+(s) = n\sup_{X_{(1)} \le x \le 1/2} K_s^+(\mathbb{F}_n(x),x),$$

have the same detection boundary for testing $H_0$ versus $H_1^{(n)}$ as the statistics
$HC_n^*$ and $BJ_n^+$. More formally, define a function $\rho_s(\beta)$ such that if
$\mu_n = \sqrt{2r\log n}$ and if we use a sequence of levels $\alpha_n \to 0$ slowly enough
[slowly enough so that $q_n(s,\alpha_n)$, with $q_n(s,\alpha)$ as defined in Section 3.3, satisfies
$nq_n(s,\alpha_n) = (1 + o(1))\log\log n$], then for $r > \rho_s(\beta)$, the resulting sequence of
tests has power tending to 1 as $n \to \infty$, while for $r < \rho_s(\beta)$, the sequence of
tests has power tending to zero. In terms of the functions $\rho_s(\beta)$, our theorem
is as follows.

Theorem 5.1. For each $s \in [-1,2]$, $\rho_s(\beta) = \rho^*(\beta)$ for $1/2 < \beta < 1$.

While this may not be too surprising for $1 \le s \le 2$ in view of the Donoho-Jin
results for $nS_n^+(1) = BJ_n^+$ and $nS_n^+(2) = (1/2)(HC_n^*)^2$, it seems new
and interesting for $s \in [-1,1)$. Figure 1 gives smoothed histograms of the
values of the statistics $nS_n(s) - r_n$ under the null hypothesis $H_0$ (solid line)
and under the alternative hypothesis $H_1^{(n)}$ (dotted line) for $n = 0.5 \times 10^6$,
$r = 0.15$ and $\beta = 1/2$. This should be compared with Figure 2 on page 978
of Donoho and Jin [15] showing values of $HC_n^*$ and $HC_n^+$, corresponding to
our $s = 2$; their $HC_n^+ = \sup_{1/n \le x \le 1/2} \sqrt{n}(\mathbb{F}_n(x) - x)/\sqrt{x(1-x)}$.

6. Discussion and some further problems. In Section 4 we have shown
that the statistics $\{S_n(s): s \in \mathbb{R}\}$ behave quite differently for $0 < s < 1$,
for $s < 0$ and $s \ge 1$. In particular, for $0 < s < 1$, the statistics $S_n(s)$ converge
almost surely to their natural parameters for fixed $F \ne F_0$. Moreover,
the different Poisson boundary behaviors for $s < 0$ and $s \ge 1$ suggest that
the statistics $S_n(s)$ with $s \ge 1$ are geared toward "heavy tails," while the
statistics $S_n(s)$ with $s < 0$ are geared more toward "light tails." This also
becomes apparent from plots of the functions $K_1(F_1(x),x)$, $K_1(F_0(x),x)$,
and of $K_0(F_1(x),x)$ and $K_0(F_0(x),x)$, where $F_1(x) = 1/(1 + \log(1/x))$ and
$F_0(x) = \exp(-(1/x - 1))$; see, for example, Jager [23], page 11.

In Section 5 we have shown that all of the statistics $\{S_n(s): -1 \le s \le 2\}$
have the same optimal detection boundary for the two-point normal mixture
testing problem considered by Donoho and Jin [15]. Thus, we have
some flexibility in designing a test that detects these subtle tail alternatives
and yet behaves very stably under fixed alternatives (in the sense of always
consistently estimating a natural parameter). Thus, it seems that the
(Hellinger-type) statistic $S_n(1/2)$ may be a very reasonable compromise test
statistic.

Here is a brief listing of some of the remaining open problems:

• Problem 1: What is the limit distribution of $S_n(s)$ under the null hypothesis
when $s < -1$ or $s > 2$?
• Problem 2: What are necessary and sufficient conditions for $S_n(s)$ to converge
to its natural parameter under fixed alternatives $F$ for $s < 0$?
• Problem 3: What is the large deviation behavior of $S_n(s)$ under the null
hypothesis for $0 < s < 1$?
• Problem 4: Is there an appropriate contiguity theory for the statistics
$S_n(s)$? (The only example involving something similar of which we are
aware is Theorem A1 of Bickel and Rosenblatt [6], but their results do
not seem to apply to the statistics $S_n(s)$.)

[Figure 1 (plot not reproduced): panels include "General family, s = 0.5" and "Berk-Jones"; r = 0.15, β = 1/2, n = 500 000, reps = 200.]

Fig. 1. Smoothed histograms of (reps) values of the statistics $S_n(s)$ under the null
hypothesis $H_0$ (solid line) and alternative hypothesis $H_1^{(n)}$ (dotted line) with $r = 0.15$, $\beta = 1/2$
for $s = 0, 0.2, 0.5, 1$.

It is fairly easy to construct versions of our statistics $S_n(s)$ in more general
settings by replacing the intervals $[0,x]$ and $(x,1]$ with sets $C$ and $C^c$ for
$C$ in some class of sets $\mathcal{C}$. Then for testing $H_0 : P = P_0$ versus $H_1 : P \ne P_0$,
a natural generalization of the statistics $S_n(s)$ is

\[ S_n(s,\mathcal{C}) = \sup_{C \in \mathcal{C}} K_s(\mathbb{P}_n(C), P_0(C)), \]

where $\mathbb{P}_n$ is the empirical measure of $X_1,\ldots,X_n$ i.i.d. $P$.

• Problem 5: Do the statistics $S_n(s,\mathcal{C})$ have reasonable power behavior for some of the "chimeric alternatives" of Khmaladze [30] for some choice of $\mathcal{C}$?



7. Proofs. 



7.1. Proofs for Section 3. 



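Proof of Theorem 3.1. We first carry out the proof for $-1 \le s < 1$,
and then indicate the changes that are necessary for $1 \le s \le 2$. Fix $s \in [-1,1)$. Note that

\[ \frac{\partial}{\partial u} K_s(u,v)\Big|_{u=v} = \Big\{ \phi_s'\Big(\frac{u}{v}\Big) - \phi_s'\Big(\frac{1-u}{1-v}\Big) \Big\}\Big|_{u=v} = \phi_s'(1) - \phi_s'(1) = 0 \]

and

\[ \frac{\partial^2}{\partial u^2} K_s(u,v) = \phi_s''\Big(\frac{u}{v}\Big)\frac{1}{v} + \phi_s''\Big(\frac{1-u}{1-v}\Big)\frac{1}{1-v} = \Big(\frac{u}{v}\Big)^{s-2}\frac{1}{v} + \Big(\frac{1-u}{1-v}\Big)^{s-2}\frac{1}{1-v} \equiv D_s(u,v). \]

Hence, it follows by Taylor expansion of $u \mapsto K_s(u,v)$ about $u = v$ that

\[ K_s(u,v) = K_s(v,v) + \frac{\partial}{\partial u} K_s(u,v)\Big|_{u=v}(u-v) + \frac{1}{2}\frac{\partial^2}{\partial u^2} K_s(u,v)\Big|_{u=u^*}(u-v)^2 = 0 + 0 + \tfrac{1}{2}(u-v)^2 D_s(u^*,v) = \tfrac{1}{2}(u-v)^2 D_s(u^*,v) \]

for some $u^*$ satisfying $|u^* - v| \le |u - v|$. This yields

\[ K_s(\mathbb{F}_n(x),x) = \tfrac{1}{2}(\mathbb{F}_n(x)-x)^2 D_s(\mathbb{F}_n^*(x),x) \tag{9} \]

for $0 < x < 1$, where $|\mathbb{F}_n^*(x) - x| \le |\mathbb{F}_n(x) - x|$; that is, $x \le \mathbb{F}_n^*(x) \le \mathbb{F}_n(x)$ on
the event $x \le \mathbb{F}_n(x)$ and $\mathbb{F}_n(x) \le \mathbb{F}_n^*(x) \le x$ on the event $\mathbb{F}_n(x) \le x$.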
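The two derivative identities behind (9) can be checked numerically. The following is a minimal sketch, not part of the original proof: the function names `K`, `D`, `second_difference` and `taylor_ratio` are hypothetical, and the closed form used for $K_s$ (with the limiting cases $s = 0$ and $s = 1$) is assumed from the definition $K_s(u,v) = v\phi_s(u/v) + (1-v)\phi_s((1-u)/(1-v))$ that appears later in this section.

```python
import math

def K(s, u, v):
    """K_s(u, v) for 0 < u, v < 1; s = 0 and s = 1 are treated as limiting cases (assumption)."""
    if s == 1:   # Kullback-Leibler divergence between Bernoulli(u) and Bernoulli(v)
        return u * math.log(u / v) + (1 - u) * math.log((1 - u) / (1 - v))
    if s == 0:   # reversed Kullback-Leibler divergence
        return v * math.log(v / u) + (1 - v) * math.log((1 - v) / (1 - u))
    return (1 - u**s * v**(1 - s) - (1 - u)**s * (1 - v)**(1 - s)) / (s * (1 - s))

def D(s, u, v):
    """D_s(u, v) = (u/v)^(s-2)/v + ((1-u)/(1-v))^(s-2)/(1-v)."""
    return (u / v)**(s - 2) / v + ((1 - u) / (1 - v))**(s - 2) / (1 - v)

def second_difference(s, u, v, h=1e-4):
    """Central finite-difference approximation of the second u-derivative of K_s."""
    return (K(s, u + h, v) - 2 * K(s, u, v) + K(s, u - h, v)) / h**2

def taylor_ratio(s, u, v):
    """K_s(u, v) divided by (u - v)^2 / 2; by the expansion this equals D_s(u*, v)."""
    return K(s, u, v) / (0.5 * (u - v)**2)
```

For any fixed $v$, `second_difference(s, u, v)` should agree with `D(s, u, v)` up to discretization error, and `taylor_ratio(s, u, v)` should lie between the smallest and largest values of `D(s, w, v)` for `w` between `v` and `u`.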
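We can write (9) as

\[ K_s(\mathbb{F}_n(x),x) = \frac{1}{2}\frac{(\mathbb{F}_n(x)-x)^2}{x(1-x)}\{1 + x(1-x)D_s(\mathbb{F}_n^*(x),x) - 1\}, \tag{10} \]

where the remainder $\mathrm{Rem}_n(x) = x(1-x)D_s(\mathbb{F}_n^*(x),x) - 1$ satisfies

\[ |\mathrm{Rem}_n(x)| = \Big| (1-x)\Big(\frac{x}{\mathbb{F}_n^*(x)}\Big)^{2-s} + x\Big(\frac{1-x}{1-\mathbb{F}_n^*(x)}\Big)^{2-s} - (1-x) - x \Big| \]
\[ \le (1-x)\Big| \Big(\frac{x}{\mathbb{F}_n^*(x)}\Big)^{2-s} - 1 \Big| + x\Big| \Big(\frac{1-x}{1-\mathbb{F}_n^*(x)}\Big)^{2-s} - 1 \Big|. \]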



Fix $\delta \in (0,1/2)$. Now for $x \in [\delta, 1-\delta]$, $\mathbb{F}_n(x) \in [\delta/2, 1-\delta/2]$ almost surely for all sufficiently large $n$,
so, much as in Wellner and Koltchinskii [43],

sup 

S<x<l~5 

For $0 < v < 1/2$ and $1 \le s \le 2$, the function $u \mapsto D_s(u,v)$ is monotone for
$u \in (0,1/2]$, while for $0 < v < 1/2$ and $-1 \le s < 1$, the function $D_s(u,v)$
is monotone for $u \in (0, b(v,s)]$, where $b(v,s) = 1/(1+c(v,s))$ with $c(v,s) = ((1-v)/v)^{(1-s)/(3-s)}$, so

\[ x(1-x)D_s(\mathbb{F}_n^*(x),x) \le x(1-x)\{D_s(x,x) \vee D_s(\mathbb{F}_n(x),x)\} \]

on the set $\{\mathbb{F}_n(x) \le 1/2 \wedge b(x,s)\}$. Since $P(\mathbb{F}_n(\delta) > 1/2 \wedge b(\delta,s)) \to 0$ for
$\delta < 1/2$ and $s \in [-1,2]$, we get

\[ \sup_{X_{(1)} \le x \le \delta} |\mathrm{Rem}_n(x)| \le \sup_{X_{(1)} \le x \le \delta} \Big| \Big(\frac{x}{\mathbb{F}_n(x)}\Big)^{2-s} - 1 \Big| + \sup_{X_{(1)} \le x \le \delta} \Big| \Big(\frac{1-x}{1-\mathbb{F}_n(x)}\Big)^{2-s} - 1 \Big| = O_p(1) + O_p(1) = O_p(1) \tag{11} \]

by Shorack and Wellner [38], inequality 1, page 415, and inequality 2, (10.3.6),
page 416. Alternatively, see Wellner [42], Lemma 2, page 75, and Remark
1(ii). Similarly,

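\[ \sup_{1-\delta \le x < X_{(n)}} |\mathrm{Rem}_n(x)| \le \sup_{1-\delta \le x < X_{(n)}} \Big| \Big(\frac{x}{\mathbb{F}_n(x)}\Big)^{2-s} - 1 \Big| + \sup_{1-\delta \le x < X_{(n)}} \Big| \Big(\frac{1-x}{1-\mathbb{F}_n(x)}\Big)^{2-s} - 1 \Big| = O_p(1) + O_p(1) = O_p(1). \tag{12} \]

Now we write

\[ S_n(s) = S_n(s,I) \vee S_n(s,II) \vee S_n(s,III), \]

where

\[ S_n(s,I) = \sup_{\delta \le x \le 1-\delta} K_s(\mathbb{F}_n(x),x) = \tfrac{1}{2} \sup_{\delta \le x \le 1-\delta} (\mathbb{F}_n(x)-x)^2 D_s(\mathbb{F}_n^*(x),x), \]
\[ S_n(s,II) = \sup_{X_{(1)} \le x \le \delta} K_s(\mathbb{F}_n(x),x) = \tfrac{1}{2} \sup_{X_{(1)} \le x \le \delta} (\mathbb{F}_n(x)-x)^2 D_s(\mathbb{F}_n^*(x),x), \]
\[ S_n(s,III) = \sup_{1-\delta \le x < X_{(n)}} K_s(\mathbb{F}_n(x),x) = \tfrac{1}{2} \sup_{1-\delta \le x < X_{(n)}} (\mathbb{F}_n(x)-x)^2 D_s(\mathbb{F}_n^*(x),x). \]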

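By the monotonicity of $u \mapsto D_s(u,v)$ for $u \le 1/2$ again, with probability
tending to 1,

\[ S_n(s,II) \le \tfrac{1}{2} \sup_{X_{(1)} \le x \le \delta} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \vee D_s(x,x)\}, \]
\[ S_n(s,II) \ge \tfrac{1}{2} \sup_{X_{(1)} \le x \le \delta} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \wedge D_s(x,x)\}, \]

and similarly, by the monotonicity of $u \mapsto D_s(u,v)$ for $1/2 \le u < 1$,

\[ S_n(s,III) \le \tfrac{1}{2} \sup_{1-\delta \le x < X_{(n)}} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \vee D_s(x,x)\}, \]
\[ S_n(s,III) \ge \tfrac{1}{2} \sup_{1-\delta \le x < X_{(n)}} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \wedge D_s(x,x)\}. \]

For $S_n(s,I)$,

\[ S_n(s,I) = \tfrac{1}{2} \sup_{\delta \le x \le 1-\delta} \frac{(\mathbb{F}_n(x)-x)^2}{x(1-x)} \{1 + O_p(n^{-1/2})\} \le \tfrac{1}{2} \sup_{\delta \le x \le 1-\delta} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \vee D_s(x,x)\}\{1 + O_p(n^{-1/2})\}. \]

In the second region the argument above leading to (11) yields

\[ S_n(s,II) \le \tfrac{1}{2} \sup_{X_{(1)} \le x \le \delta} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \vee D_s(x,x)\}, \]
\[ S_n(s,II) \ge \tfrac{1}{2} \sup_{X_{(1)} \le x \le \delta} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \wedge D_s(x,x)\}, \]

and similarly for $S_n(s,III)$. It follows that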

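\[ S_n(s) \le \tfrac{1}{2} \sup_{X_{(1)} \le x < X_{(n)}} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \vee D_s(x,x)\}\{1 + O_p(n^{-1/2})\} \]
\[ = \tfrac{1}{2} \sup_{X_{(1)} \le x < X_{(n)}} \Big\{ \frac{(\mathbb{F}_n(x)-x)^2}{x(1-x)} \{1 \vee x(1-x)D_s(\mathbb{F}_n(x),x)\} \Big\} \{1 + O_p(n^{-1/2})\} \tag{13} \]

and, on the other hand,

\[ S_n(s) \ge \tfrac{1}{2} \sup_{X_{(1)} \le x < X_{(n)}} (\mathbb{F}_n(x)-x)^2 \{D_s(\mathbb{F}_n(x),x) \wedge D_s(x,x)\}\{1 + O_p(n^{-1/2})\} \]
\[ = \tfrac{1}{2} \sup_{X_{(1)} \le x < X_{(n)}} \Big\{ \frac{(\mathbb{F}_n(x)-x)^2}{x(1-x)} \{1 \wedge x(1-x)D_s(\mathbb{F}_n(x),x)\} \Big\} \{1 + O_p(n^{-1/2})\}. \tag{14} \]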

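Now we break the supremum into the regions $[X_{(1)}, d_n]$, $[d_n, 1-d_n]$ and
$[1-d_n, X_{(n)}]$ with $d_n = (\log n)^k/n$ for any $k > 1$. Then we have

\[ n \sup_{X_{(1)} \le x \le d_n} \frac{(\mathbb{F}_n(x)-x)^2}{x(1-x)} = o_p(b_n^2), \]

where $b_n = \sqrt{2\log_2 n}$; see Shorack and Wellner [38], (26), page 602. Moreover,

\[ \sup_{X_{(1)} \le x \le d_n} |x(1-x)D_s(\mathbb{F}_n(x),x)| = O_p(1), \]

so

\[ n \sup_{X_{(1)} \le x \le d_n} \frac{(\mathbb{F}_n(x)-x)^2}{x(1-x)} \{1 \,\#\, x(1-x)D_s(\mathbb{F}_n(x),x)\} = o_p(b_n^2) \tag{15} \]

for $\# = \wedge$ or $\# = \vee$, and similarly for the region $[1-d_n, X_{(n)}]$ by a symmetric
argument. On the other hand, if we define

\[ Z_n = \sup_{d_n \le x \le 1-d_n} \frac{\sqrt{n}\,|\mathbb{F}_n(x)-x|}{\sqrt{x(1-x)}}, \]

then, for $k > 5$,

\[ \frac{Z_n}{b_n} \to_p 1 \tag{16} \]

and

\[ b_n Z_n - c_n \to_d Y_4 \sim E_v^4, \tag{17} \]

where $c_n = 2\log_2 n + (1/2)\log_3 n - (1/2)\log(4\pi)$ (see, e.g., Shorack and Wellner [38], page 600, (16.1.20) and (16.1.17)). [Note that for the middle bracket
in (15) we have

\[ x(1-x)D_s(\mathbb{F}_n(x),x) = (1-x)\Big(\frac{x}{\mathbb{F}_n(x)}\Big)^{2-s} + x\Big(\frac{1-x}{1-\mathbb{F}_n(x)}\Big)^{2-s}, \]

so

\[ \sup_{X_{(1)} \le x \le d_n} x(1-x)D_s(\mathbb{F}_n(x),x) \le \sup_{X_{(1)} \le x \le d_n} \Big(\frac{x}{\mathbb{F}_n(x)}\Big)^{2-s} + O_p(1), \]

so by Wellner [42], Remark 1, the probability of large values of the main
term can be bounded by

\[ P\Big( \sup_{X_{(1)} \le x \le d_n} \Big(\frac{x}{\mathbb{F}_n(x)}\Big)^{2-s} \ge \lambda \Big) = P\Big( \sup_{X_{(1)} \le x \le d_n} \frac{x}{\mathbb{F}_n(x)} \ge \lambda^{1/(2-s)} \Big) \le P\Big( \sup_{X_{(1)} \le x \le 1} \frac{x}{\mathbb{F}_n(x)} \ge \lambda^{1/(2-s)} \Big) \le e\,\lambda^{1/(2-s)} \exp(-\lambda^{1/(2-s)}).] \]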



Furthermore, 
(18) 

almost surely where 



¥n{x) - X 



X 



0{an) 



2 _ log2 " _ log2 " 



ndr, 



(logn) 



0; 



see Shorack and Wellner [38], page 424, (4.5.10) and (4.5.11). It follows 
from (13), (14) and (15)-(18) that 



nSn{s) 



sup ^ " ' . ' (l + Op(a„))Vop(6^) 
d„<x<i-d„ x(l-x) 



(19) 



x{l + Op(n"i/2)} 



{z2vop(62)} + Op(l). 



Hence, we can write 



It follows that 



nSJs) 



\ZI = \{Zn- Cn/hn){Z^ + C„/6„) + 

— ^h (7 r Ih \ ^-n- + (^n/bn 1 



1< 

2 
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= hrXZn - Cn/^n) ^"+^'y^" V {o,{hl) - + 0,(1) 

(20) 

= h^iZn - cM^2llhl±2llML V (op(l) - 1/2)62 + ^^(1) 

4y4^v{-oo} = y4; 

here we used $c_n/b_n \sim b_n$ in the second equality. Since

\[ \frac{1}{2}\frac{c_n^2}{b_n^2} = \log_2 n + (1/2)\log_3 n - (1/2)\log(4\pi) + o(1) = r_n + o(1), \]

this yields

\[ P(nS_n(s) - r_n \le x) \to \exp(-4\exp(-x)), \tag{21} \]

which is the claim of Theorem 3.1. Note that the centering $c_n^2/(2b_n^2)$
emerges naturally in the course of this proof. This completes the proof for
the case $s \in [-1,1)$. For $1 \le s < 2$, there are two additional terms that enter,
and both of these are $o_p(b_n^2)$ from the arguments in the previous section.
The case $s = 2$ is easy since in this case $v(1-v)D_s(u,v) = 1$ for all $u$, while
the result was stated for the case $s = 1$ by Berk and Jones [4] and proved
in Wellner and Koltchinskii [43]. (Wellner and Koltchinskii [43] incorrectly
claim (page 324) that $K(\mathbb{F}_n(x),x) = 0$ if $x < X_{(1)}$; in fact, the supremum
over this region is stochastically bounded and, hence, can be neglected.) □
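To see how the centering $r_n$ and the limit law in (21) would be used in practice, here is a minimal sketch under stated assumptions. The function names are hypothetical; $\log_2 n$ and $\log_3 n$ are read as $\log\log n$ and $\log\log\log n$; and the supremum is taken only over $X_{(1)} \le x < X_{(n)}$, so for $s \ge 1$ the two additional tail terms mentioned in the proof are omitted. Because $K_s(u,\cdot)$ is convex with minimum at $u$, and $\mathbb{F}_n = i/n$ on $[X_{(i)}, X_{(i+1)})$, the supremum on each such interval is attained at an endpoint.

```python
import math
import random

def K(s, u, v):
    """K_s(u, v) for 0 < u, v < 1; s = 0 and s = 1 are treated as limiting cases (assumption)."""
    if s == 1:
        return u * math.log(u / v) + (1 - u) * math.log((1 - u) / (1 - v))
    if s == 0:
        return v * math.log(v / u) + (1 - v) * math.log((1 - v) / (1 - u))
    return (1 - u**s * v**(1 - s) - (1 - u)**s * (1 - v)**(1 - s)) / (s * (1 - s))

def S_n(s, sample):
    """Supremum of K_s(F_n(x), x) over X_(1) <= x < X_(n) for data in (0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    best = 0.0
    for i in range(1, n):          # F_n equals i/n on [X_(i), X_(i+1))
        u = i / n
        best = max(best, K(s, u, xs[i - 1]), K(s, u, xs[i]))
    return best

def r_n(n):
    """Centering r_n = log_2 n + (1/2) log_3 n - (1/2) log(4 pi), with iterated logarithms."""
    l2 = math.log(math.log(n))
    return l2 + 0.5 * math.log(l2) - 0.5 * math.log(4 * math.pi)

def limit_cdf(x):
    """Limiting distribution function exp(-4 exp(-x)) from (21)."""
    return math.exp(-4.0 * math.exp(-x))

if __name__ == "__main__":
    random.seed(0)
    n = 2000
    sample = [random.random() for _ in range(n)]   # null hypothesis: Uniform(0, 1)
    stat = n * S_n(0.5, sample) - r_n(n)
    print(stat, limit_cdf(stat))
```

Since the normalizing constants involve iterated logarithms, agreement between the simulated statistic and `limit_cdf` may be rough at moderate sample sizes.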

Proof of Theorem 3.2. The fact that $nT_n(2) = A_n^2/2 \to_d A^2/2$ is classical;
see Shorack and Wellner [38], page 148. That $nT_n(1) \to_d A^2/2$ was noted
by Einmahl and McKeague [18] and proved by Wellner and Koltchinskii [43].
The proof for $s \ne 1, 2$ proceeds along the same lines as the proof in Wellner
and Koltchinskii [43] for the case $s = 1$, and hence will not be given here.
For details, see Jager [23] or Jager and Wellner [26]. □

7.2. Proofs for Section 4. 

Proof of Proposition 4.1. We first prove the claim for the "unre- 
stricted version" of the statistics Slf{s) defined by Slf{s) =svlpq^^^iKs x 
{¥n{x),x), and then show that the difference between 5„(s) and Slf{s) is 
negligible. Now for s £ (0, 1) and Cg = l/(s(l — s)), we have 

\s:ris)-Soo{s,F)\ 

<Cs sup \{l-Wn{xyx^-' -{l-¥n{x)Y{l-xf-'} 
0<x<l 

- {1 - F{xyx^-' - (1 - F{x)Y{i - xy-']\ 
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+ sup|{(l - ¥n{x)Y - (1 - F{x)y}x'~'\\ 

X J 

< cisup |F„(x)^ - F{xy\ + sup|(l - E„(x))^ - (1 - F{x)Y\] 

\^ X X J 

Thus, the proposition wih be proved if we show that 
(22) 5r(s)-5„(s)'^-4-0. 
Now write S^{s) = max{i?„, M„, where 

Mn= sup Ks{¥n{x),x) = Sn{s), 
X(i)<x<X{„) 



Ln= sup ifs(lFn(a;),a;) 

and 



= sup^>^^^j Ks (F„ (x) , x) . 

Note that 

rO, ifM„>L„Vii„, 

- = max{L„, M„, - M„ = <^ L„ - if L„ > M„ V 

Now set 

ao = ao(F) = sup{x : F{x) = 0} > 0, 
ai = ai{F) = M{x : F{x) = 1} < 1. 

Note that 

Ln = sup if,(F„(x),x) = — 

^ {l-{l-aoy"'}^lo{s,F), 



s{l - s 
and, on the other hand, 

Mn= sup K,(F„(x),x)>K,(F„(X(i)),X(i)) = K,(l/n,X(i)) 

X{i)<2;<X(„) 

^ -{1 - - (1 - l/nYil - ^ 



^{l-(l-ao)^-n = ^o(^,F). 
s(l-s) 
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Similarly, 



Rn= sup Ksi¥nix),x) = ^^^^{l-Xl-f} 



x>X. 



(n) 



1 {l-a|-}^ro(5,F), 



s{l-s) 
while 

Mn= sup i^,(F„(x),x)>K,(F„(X(„)-),X(„))=/f,(l-l/n,X(„)) 

X(i)<x<X{„) 



- (1 - ^/nyxl-; - (l/n)^(l - ^ <, 

1 



By combining these pieces, it follows that 
< - Sn{s) = max{L„,M„,i?„} - M„ 

0, if M„>L„Vi?„, ] r 0, ifM„>L„Vi?„, 

L„ - M„, if L„ > Mn yRnA<{ Ln " L° , if Ln>L°\/ Rn, 
Rn-Mn, if i?n > A/„ V L„ J I ii„, - if ii„, > V L„ 

This shows that (22) holds and completes the proof. □ 

Proof of Proposition 4.2. Recall that when s = 2 such a condition 
follows from Theorem 3 in Jager and Wellner [25]: taking 6=1/2 and ap- 
plying continuous mapping, we conclude that 

5„(2)^- sup K2{F{x),x) if and only if E{[X{1 - X)]~^^^} < oo. 

0<x<l 

Similarly, for 1 < s < oo, 

1 



Sn{s) = sup {F„(x)V-^ + (l-F„(x))'^(l-x)i-^-l} 

0<x<l 

"-4- sup {F{xyx^-' + (1 - F{x)y{l - x^-' - !}■ 



sis-1) 
1 



0<a<l sls-l) 
if and only if 

||F„(x)"x^-^ - F{xYx^-'\\ 

and 

11(1 - F„(rE))^(l - xf-^ - (1 - F(x))^(l - xf-^\\ ''^ 0, 
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if and only if 



and 



if and only if 



and 



^(s-l)/s 

l-Fn(x) 

Gn{F{x)) 
l-Gn{F{x)) 



F{x) 

„(s~l)/s 



1 - F{x) 

F{x) 
1 - F{x) 



0, 



^■0 



0. 



Since 5((n) = is uniformly continuous on bounded sets, these last two 
convergences occur if and only 



GniF{x)) Fix) 



X 



{s-l)/s 



r{s-l)/s 







and 



l-G„(F(x)) l-F(x) 



These, in turn, hold if and only if 



i{u) - u 



and 



1 - Gn{u) -(I-U) 



(l-F-i(n))(^-i)/^ 



0. 



But in view of Wellner [41], these convergences hold if and only if F satisfies 

■ an < oo. 



{F-^{u){l-F-^{u)))(^-^)/' 



By the (inverse) probability integral transformation, the convergence in the 
last display is equivalent to E[X{1 — X)]^-^^'^^^'^ < oo. This completes the 
proof of the claimed equivalences. □ 



Proof of Proposition 4.3. For $s = 1$, this follows from Berk and
Jones [5], pages 55-56. Thus, it suffices to prove the claimed convergences
for $s > 1$ and $s < 0$.

For $s > 1$, fix $\alpha \in (0,1)$. We begin by breaking the supremum over $(0,1)$
into the regions $0 < F_s(x) \le n^{-\alpha}$, $n^{-\alpha} \le F_s(x) \le 1 - n^{-\alpha}$ and $1 - n^{-\alpha} \le F_s(x) < 1$:

Sn{s) = sup {WnixYx'-' + (1 - F„(x))"(l - x)'~' - 1}^^ 

0<a;<l ^(S-Ij 

sup {¥^{xyx'-' + (1 - F„(rE))"(l - x)'-' - 1}^^ 

x:0<Fs{x)<n-°' ^{^ ~ ^) 

V sup {F„(x)V-^ + (l-F„(x)r(l-x)i-^-l}— ^ 

x:n-''<Fs{x)<l~n-" ^{'^ ^) 

V sup {¥^{xrx'-' + (1 - F„(x))"(l - - 1}^\t 

x:l-n-°'<Fs{x)<l ~ '-) 

= /„,(s) V//„(s) V///„,(s). 

For the main term, Inis), let G„ be the empirical d.f. of n i.i.d. Uniform(0, 1) 
random variables and use F„ = Gn{Fs) to write 



s{s-l)Inis)= sup 



GniFsix)) 



0<Fs(x)<n-a I V Fs{x) 

l-Gn{Fsix)) 



+ 



l-Fs{x) 



FsixYx'-' 

(l-F,(x))^(l-x)i-^-U, 



where 



sup 

0<t<n- 



l-s 



FsixYx 

uniformly in x G [0,n~"], while 



l + (xi-^-l)/(s-l) 



s-l 



sup 

x:Fg {x)<n~ 



1-GniFsix)) 



1-Fs{x) 



and 



sup \{l-Fs{x)Y{l-xY-' -l\^0. 

x:Fs(x)<n-" 

Combining these last five displays shows that Inis) s~^sup^^Q(N{t)/ty; 
note that the limit variable is >l/s almost surely. 
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To handle the term IIn{s), write 
Ilnis)^ sup 



+ 



Fs{x) 
l-G„(F,(x)) 



1-Fs{x) 
x(l-F,(x)r(l-rE)i-^-l| 



.(.-1)' 



where now the two terms involving the ratio of the empirical d.f. to the true 
d.f. Fg converge almost surely to 1. Hence, we conclude that 

IIn{s)"^' sup Ks{Fs{x),x) = -, 

0<x<l s 

where the equality follows after some calculation. Finally, it is easily shown 
that ///„(s)'^-0. 

For s < 0, fix a G (0, 1). We begin by breaking the supremum over (0, 1) 
into the regions X^^i-^ < x < F~^(n^°'), F~^(n^") < x < -^,"^(1 — and 
F-Hl-n-")<x<X(„): 

Sn{s)= sup {Fjx)V-^ + (l-F„(x)r(l-x)i^^-l} ^ 



sis-1) 



sup {F,(x)V-^ + (l-F„(x)r(l-x)i^^-l}- ^ 



3;:X(i)<x<F7^(n-") ^) 

V sup {¥n{xyx^-' 

x:Fr'^(n-°')<x<Fr'^{l-n-") 

1-s 1 



+ {l-Yn{x)ni-xy-^-l} 



s{s-l) 

V sup {¥n{xyx'^^' 

x:Fr'^{l-n-°')<x<X(^„) 

+(i-F„(x)r(i-x)i-^-i}- ^ 



sis-l) 
= I„,(s) V//„(s) V///„,(s). 

For the main term, /„,(s), let G„ be the empirical d.f. of n i.i.d. Uniform(0, 1) 
random variables and use F„ = G„(Fs) to write 

^ sup if^M^'^FAxYx'-^ 
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l-FJx) 



l-Gn{Fs{x)) 

x(l-F,(x))^(l-x)i-^-l 



where 



sup = sup sup 



x:X^,^<x<F-\n-^)y^n{Fs{x)) J ^^,^<t<n-'-\Gn{t) J t>5i VN(i) 

F,(x)^xi^^ = x^~'{l - s(a;-(i-^) - 1)) ^ -s 
uniformly in x G [0,F~^{n~°')], while 



sup 

x:Fs{x)<n- 



l-Fs{x) ^ 



l-Gn{F,{x)) 



and 



sup |(l-F,(x))'(l-x)^-'-lHO. 

x:Fs{x)<n~" 

Combining these last five displays shows that /„,(s) (1 — s)~^ sup^yg_^{t/ 
N(t))~*; note that the limit variable is > 1/(1 — s) almost surely. 
To handle the term IIn{s), write 

Ilnis) ^ sup \( jM^) ~ F,(x)«x1- 

n-"<Fs(x)<l-n-" ' 



+ 



ln{Fsix)) 

l-Fs{x) 

l-Gn{Fs{x)) 

X (l-F,(x)r(l-x)i-^-l 



s{s-iy 



where now the two terms involving the ratio of the empirical d.f. to the true 
d.f. Fg converge almost surely to 1. Hence, we conclude that 

IIn{s)"^' sup Ks{Fs{x),x) = 

0<x<l i — S 

where the equality follows after some calculation. Finally, it is easily shown 
that ///„(s)'^-0. □ 

To prove Theorem 4.4 and its corollary, we will use the following lemma
from Chernoff [9].

Lemma 7.1. Let $X_1, X_2, \ldots$ be i.i.d. with continuous distribution $F_0$.

(a) If $t \le F_0(x)$, then $n^{-1}\log P(\mathbb{F}_n(x) \le t) \to -K_1^{-}(t, F_0(x))$.

(b) If $t \ge F_0(x)$, then $n^{-1}\log P(\mathbb{F}_n(x) \ge t) \to -K_1^{+}(t, F_0(x))$.

In both cases, the convergence is from below.

Proof. This follows from Theorem 1 of Chernoff [9]. □
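Part (b) of Lemma 7.1 can be illustrated with exact binomial tails, since $n\mathbb{F}_n(x)$ is Binomial$(n, F_0(x))$. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the rate function in the lemma is the Bernoulli Kullback-Leibler divergence $t\log(t/p) + (1-t)\log((1-t)/(1-p))$ with $p = F_0(x)$.

```python
import math

def kl_bernoulli(t, p):
    """Kullback-Leibler divergence between Bernoulli(t) and Bernoulli(p), 0 < t, p < 1."""
    return t * math.log(t / p) + (1 - t) * math.log((1 - t) / (1 - p))

def log_binom_upper_tail(n, p, k):
    """log P(Bin(n, p) >= k), computed with a log-sum-exp for numerical stability."""
    logs = [
        math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
        + j * math.log(p) + (n - j) * math.log1p(-p)
        for j in range(k, n + 1)
    ]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))

def tail_rate(n, p, t):
    """(1/n) log P(F_n(x) >= t) when F_0(x) = p, i.e. when n F_n(x) ~ Bin(n, p)."""
    k = math.ceil(n * t - 1e-9)    # F_n(x) >= t iff the count is at least ceil(n t)
    return log_binom_upper_tail(n, p, k) / n
```

For example, with $p = 0.2$ and $t = 0.3$, `tail_rate(n, 0.2, 0.3)` should increase toward `-kl_bernoulli(0.3, 0.2)` as `n` grows while staying below it, which is the "from below" statement of the lemma.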

Proof of Theorem 4.4. We first prove the theorem for Since 
Kf[t,x) is continuous in t and strictly increasing on (x, 1), then for < a < 
(1 — x^~*)/[s(l — s)], there is a unique r = t{x) in (x, 1) for which K^{t, x) = 
a and {t : Kt{t,x) > a} = [t,oo). If a > (1 - x^~'')/[s{l - s)], then r = 1 
necessarily. 

For any fixed x G (0, 1), we have 

-logP(5r'+(s) >a) > -logP(K+(F„(x),x) >a) 
n n 

= - log P(F„(x) > r) ^ -K+{t, x) 
n 

by Lemma 7.1. So 

(23) liminfilogP(S7'+(s)>a)>- inf K+{t{x),x). 

Now consider the reverse inequality. Let sup^|j denote the supremum for 
^{i) <x< Xf^^j^iy Since F„(x) = i/n on this range, we have 

supir+(F„(x),x) =K+(i/7i,X(,)) Vi^+(Vn,X(,+i)) =i^+(i/n,X(i)). 

x\i 

Note that for x < -^(i), we have F„(x) = 0, and so i^+(F„(x), x) = also. So 
we can write 5^'''+(s) = maxi<i<n{K+(z/n, X(j))} = maxi<i<„{ir+(F„(X(j)), 
X(j))}. Now, using monotonicity of r, 

^■logP{Sr+{s)>a) 



n 



- log P f max {R't (F„ ) , ) } > a 



<-logJ2 P{Kt (F„ (X(,) ) , ) > a) 
1 " 

= -log^P(F„(X(,))>r(X(,))) 
n . -, 

-I n 1 " 

= -log5]P(Vn> r(X(,))) = - log ^P(T-Hi/n) > X(,)) 

i=l 1=1 
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1 " 

< -log^P(F„(r-i(i/n)) >F„(X(,))) 

1=1 

= - log f2Pi^n{r~\i/n))>i/n) 
1=1 

n 

' -nK+{i/n,T-^{i/n)) 



<-log^e-"-^ W"'^ [by Lemma 7.1(b)] 

^ i=i 

1 rt n 

< L\Qg\^ Q~''^^^^^<i<-n.K+{i/n,T-'^{i/n)) ^ _^Qgy^g-ninfo<j;<i-ft:+(x,T-i(x)) 

1=1 «=1 

n n 
1=1 

= - inf K+(r(x),x) + i^. 
So we conclude that 

(24) limsup-logP(5r'+(s)>a)<- inf K+(t(x),x). 

n^oo n 0<x<l 

Combining this last display with (23) yields the convergence part of (7). 
To prove the explicit formula for gf, note that K'^{t'^{x),x) is a decreasing 
function of x until x = [l — s(l — s)a]^^^^~^\ where {t~^ {x),x) = oo. Thus, 

inf K+(t+(x),x) 

0<a:<l ^ \ n / 

= limi^+(r+([l - s(l - s)a]i/(^-^) - e), [1 - s(l - s)a]^/^^-'^ - e) 

= -log[l-s(l-s)a]/(l-s), 

so the given formula for the infimum in (7) holds. This completes the proof 
for 5"'^''^(s). The proof for S'^''''~(s) is analogous using Lemma 7.1(a). □ 
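The explicit formula $-\log[1 - s(1-s)a]/(1-s)$ obtained at the end of the proof can be checked numerically for $0 < s < 1$. The following sketch rests on assumptions that are not spelled out in this section: the function names are hypothetical, $\tau(x)$ is computed by bisection as the root of $K_s(\tau,x) = a$ in $(x,1)$, and the large deviation rate from Lemma 7.1 is taken to be the Kullback-Leibler divergence $K_1$.

```python
import math

def _xlog(a, b):
    """a * log(a / b), with the convention 0 * log(0 / b) = 0."""
    return 0.0 if a == 0.0 else a * math.log(a / b)

def K1(u, v):
    """Kullback-Leibler divergence between Bernoulli(u) and Bernoulli(v)."""
    return _xlog(u, v) + _xlog(1.0 - u, 1.0 - v)

def Ks(s, u, v):
    """K_s(u, v) for 0 < s < 1, 0 <= u <= 1 and 0 < v < 1."""
    return (1.0 - u**s * v**(1.0 - s) - (1.0 - u)**s * (1.0 - v)**(1.0 - s)) / (s * (1.0 - s))

def tau(s, x, a, iters=200):
    """Root of K_s(t, x) = a for t in (x, 1); returns 1.0 when no such root exists."""
    lo, hi = x, 1.0
    if Ks(s, hi, x) <= a:
        return 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if Ks(s, mid, x) < a:
            lo = mid
        else:
            hi = mid
    return hi

def rate_formula(s, a):
    """Closed form -log[1 - s(1 - s)a] / (1 - s), valid when s(1 - s)a < 1."""
    return -math.log(1.0 - s * (1.0 - s) * a) / (1.0 - s)
```

With $s = 1/2$ and $a = 1$, the critical point is $x_0 = [1 - s(1-s)a]^{1/(1-s)} = 0.5625$; evaluating `K1(tau(s, x, a), x)` for `x` slightly below `x_0` should come close to `rate_formula(s, a)`, while smaller values of `x` should give larger values.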

7.3. Proofs for Section 5. The following lemma extends Lemma A.4,
page 988, Donoho and Jin [15].

Lemma 7.2. (i) For $0 < v < u < 1/2$, and $-1 \le s \le 2$,

\[ K_s^{+}(u,v) \le \frac{1}{2}\frac{(u-v)^2}{v(1-v)} = K_2(u,v). \tag{25} \]

(ii) Let $1 < s < 2$ and $v = v(u)$ satisfy $0 < v < u < 1$. Then, as $u \to 0$,

\[ K_s(u,v) = \begin{cases} K_2(u,v)\big[1 + O(1 - (v/u)^{2-s}) \vee O(((1-v)/(1-u))^{2-s} - 1)\big], & \text{if } u/v \to 1, \\ v\phi_s(u/v)(1+o(1)) = u\{(u/v)^{s-1} - s\}(1+o(1))/(s(s-1)), & \text{if } u/v \to \infty. \end{cases} \]

(iii) Let $s = 1$ and $v = v(u)$ satisfy $0 < v < u < 1$. Then, as $u \to 0$,

\[ K_1(u,v) = \begin{cases} K_2(u,v)\big[1 + O(u + (u/v) - 1)\big], & \text{if } u/v \to 1, \\ u\log(u/v)(1+o(1)), & \text{if } u/v \to \infty. \end{cases} \]

(iv) Let $s \in [-1,1) \setminus \{0\}$ and $v = v(u)$ satisfy $0 < v < u < 1$. Then, as $u \to 0$,

\[ K_s(u,v) = \begin{cases} K_2(u,v)\big[1 + O(1 - (v/u)^{2-s}) \vee O(((1-v)/(1-u))^{2-s} - 1)\big], & \text{if } u/v \to 1, \\ \dfrac{1}{1-s}\,u(1+o(1)), & \text{if } u/v \to \infty. \end{cases} \]

(v) Let $s = 0$ and $v = v(u)$ satisfy $0 < v < u < 1$. Then, as $u \to 0$,

\[ K_0(u,v) = \begin{cases} K_2(u,v)\big[1 + O(1 - (v/u)^{2}) \vee O(((1-v)/(1-u))^{2} - 1)\big], & \text{if } u/v \to 1, \\ u(1+o(1)), & \text{if } u/v \to \infty. \end{cases} \]
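Part (i) of Lemma 7.2 lends itself to a quick numerical sanity check on a grid. This is a sketch rather than a proof: the function names are hypothetical, the grid is an arbitrary choice, and the closed form for $K_s$ (with the limiting cases $s = 0$ and $s = 1$) is assumed from the representation $K_s(u,v) = v\phi_s(u/v) + (1-v)\phi_s((1-u)/(1-v))$ used in the proof of the lemma.

```python
import math

def K(s, u, v):
    """K_s(u, v) for 0 < u, v < 1; s = 0 and s = 1 are treated as limiting cases (assumption)."""
    if s == 1:
        return u * math.log(u / v) + (1 - u) * math.log((1 - u) / (1 - v))
    if s == 0:
        return v * math.log(v / u) + (1 - v) * math.log((1 - v) / (1 - u))
    return (1 - u**s * v**(1 - s) - (1 - u)**s * (1 - v)**(1 - s)) / (s * (1 - s))

def K2(u, v):
    """K_2(u, v) = (u - v)^2 / (2 v (1 - v))."""
    return 0.5 * (u - v)**2 / (v * (1 - v))

def worst_violation(s, steps=40):
    """Largest value of K_s(u, v) - K_2(u, v) over a grid with 0 < v < u < 0.49."""
    worst = -math.inf
    for i in range(1, steps):
        v = 0.49 * i / steps
        for j in range(1, steps):
            u = v + (0.49 - v) * j / steps
            worst = max(worst, K(s, u, v) - K2(u, v))
    return worst
```

If inequality (25) holds, `worst_violation(s)` should be nonpositive, up to floating-point rounding, for every `s` in $[-1, 2]$.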

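Remark. Note that for $1 < s < 2$, as $u \to 0$ and $u/v \to \infty$,

\[ v\phi_s(u/v) = v\Big\{(1-s) + s\frac{u}{v} - \Big(\frac{u}{v}\Big)^{s}\Big\}\frac{1}{s(1-s)} = \frac{u}{s(s-1)}\Big\{\Big(\frac{u}{v}\Big)^{s-1} - s + (s-1)\frac{v}{u}\Big\} = \frac{u}{s(s-1)}\Big\{\Big(\frac{u}{v}\Big)^{s-1} - s\Big\}(1+o(1)), \]

where the right-hand side converges to $u\log(u/v)(1+o(1))$ as $s \searrow 1$.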

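Proof of Lemma 7.2. (i) Letting $u = tv$, it suffices to show that for
$0 < v < 1/2$ and $1 \le t \le 1/(2v)$,

\[ K_s(tv,v) \le \frac{1}{2}\frac{(t-1)^2}{1-v}\,v \]

or, equivalently, since $K_s(tv,v) = v\phi_s(t) + (1-v)\phi_s((1-tv)/(1-v))$,

\[ \phi_s(t) + \Big(\frac{1}{v}-1\Big)\phi_s\Big(\frac{1-tv}{1-v}\Big) \le \frac{1}{2}\frac{(t-1)^2}{1-v}. \]

Let

\[ f_s(t) = \phi_s(t) + \Big(\frac{1}{v}-1\Big)\phi_s\Big(\frac{1-tv}{1-v}\Big) - \frac{1}{2}\frac{(t-1)^2}{1-v}. \]

Now by direct calculation, $f_s(1) = 0$, and

\[ f_s'(t) = \phi_s'(t) + \Big(\frac{1}{v}-1\Big)\phi_s'\Big(\frac{1-tv}{1-v}\Big)\Big(\frac{-v}{1-v}\Big) - \frac{t-1}{1-v}, \]

so that $f_s'(1) = 0$. Furthermore,

\[ f_s''(t) = \phi_s''(t) + \frac{v}{1-v}\phi_s''\Big(\frac{1-tv}{1-v}\Big) - \frac{1}{1-v} = \frac{1}{1-v}\Big\{(1-v)\phi_s''(t) + v\phi_s''\Big(\frac{1-tv}{1-v}\Big) - 1\Big\} \]
\[ \le \frac{1}{1-v}\Big\{\Big(\frac{1-v}{t(1-tv)}\Big)^{2-s} - 1\Big\} \quad \text{(since } x \mapsto x^{2-s} \text{ is concave for } 1 \le s \le 2\text{)} \]
\[ \le 0, \]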

using the fact that $v(1-v) \le vt(1-vt)$ for $0 < v \le vt \le 1/2$ implies
$(1-v)/(t(1-tv)) \le 1$. Here we have used

Since $1 \le t \le 1/(2v)$, it follows that $v \le vt \le 1/2 < 1$, $1-v \ge 1-vt \ge 1/2 > 0$
and $1 \ge (1-vt)/(1-v) \ge 1/(2(1-v))$. When $s < 1$, we calculate

/r(*)=c(*)-(^)V;"(i^ 

and note that /^"(l) < 0, while f'J'^t) = has a unique root, so to show 
f'J{t) < 0, it suffices to show /^'(l/(2z;)) < for < u < 1/2. By a straight- 
forward calculation, we get 

l-v\'^-'' 1 



/:'(i/(2„))=,2„)-+^(i^) 



l-v\ 1/2 J 1-v' 

which is < for < u < 1/2 if s > -1. This shows that fj{t) < in the 
range — 1 < s < 1, and completes the proof of (i). 

(ii) By expanding Ks{u,v) as a function of u as in (10), 

Ks{u,v) = K2{u,v){l + v{l - v)Ds{u*,v) - 1} 

with \u* — v\ < \u — v\; since < v <u, we necessarily have < v < u* < u. 
Here 

v\'^~' 1 r/l-t;^2-s 



.(1 - v)D,{u\v) - 1 = (1 - (^-rj - V + ^( [t^ ) - 1 
= / + //. 
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Now <v <u* <u implies 1 < u* /v < u/v, so v/u < v/u* < 1 and 1 — v > 
1-u* > 1-n implies {l-v)/{l-u) > {l-v)/{l-u*) > 1. Thus, / < and 
// > 0. It follows that 

v{l-v)Ds{u*,v)-l<v!^Q^y 

Similarly, 

v{l-v)D,{u\v)-l>{l-v)!^(^^y 



in this range, and the claimed bound in the first part of (ii) follows. 

To prove the second part of (ii), note that when ti/u — > oo, we can write 

Ksiu,v) _ 1 - {u/vYv - {{1 - u)/{l - v)y{l - v) 
v4)s{u/v) v{l — s + s{u/v) — {u/vY} 

_ {u/vYv + ((1 - n)/(l - ^))"(1 -v)-l 
{u/vYv — v{l — s) — su 

_ 1 + [((1 - u)/{l - v)Y{l -v)- l]/[{u/vYv] 
~ l-[v{l- s) + su]/[{u/vYv\ 

_ l + A{u,v) 



l-B{u,v) 



where, for 1 < s < 2, 



\ v{l-s) + su l-s 1 
B{u,v)= \ / = ^^ + s{v/uY =o(l) 

and 

{{l-u)/{l-vY%l-v)-l 



A{u,v) 



[u/vYv 

((1 - n)/(l - vY%l -v)-l] + ((1 - n)/(l - vY' - 1 

{u/vYv 

((i-u)/(i-^)r , {{i-u)/{i-vY'-i 

{u/vY {u/vYv 

,^ , I — SU + sv - I ,^ , 

:0 1 + + 1 

[u/vYv 

^ '^(l) + = - s{v/uY~' = o(l). 



Thus, the second part of (ii) holds. 
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The first part of (iv) is proved exactly as in (ii). To prove the second part 
of (iv), we write 

Ks{u,v) = -r^{l - {u/vYv - ((1 - n)/(l - t;))^(l - v)} 



s{l-s) 
1 

1 

41 -s) 



1 

s{l-s) 

1 



















+ 


(1 



u 

V 
s-1 



1-u 
1-v 



-1 + 



(1 — v){s{u — v) + o{u) + o{v)) 
1 — sv + o{v) 



{su{l - iv/u)){l + o(l)) - u{v/u)'-'{l - {v/uY}} 



l-s 



n(l + o(l)). 



(v) The proof of (v) is similar to the proof of (iv). □ 

Lemma 7.3. Suppose that $X_1,\ldots,X_n$ are i.i.d. $F_n$ with $0 < \rho^*(\beta) < r < \beta/3$.
Then $r < 1/4$ and for any $0 < r_0 < r$,



(26) 



sup 



0. 



Proof. Note that $\mathbb{F}_n(\cdot) = \mathbb{G}_n(F_n(\cdot))$, where $\mathbb{G}_n$ is the empirical d.f. of
$n$ i.i.d. $U(0,1)$ random variables $\xi_1,\ldots,\xi_n$ and

\[ F_n(x) = x + \varepsilon_n\{(1-x) - \Phi(\Phi^{-1}(1-x) - \mu_n)\} \ge x. \]

Thus, with II • lll^ = suY>a<t<h l/(OI> 
F„(x) 



sup 

-'^'^<x<n-*^o 



1 



(27) 
(28) 



Fn(x) 



^{Fn{x)) 



+ 


n-4'-0 








V 


( 










\ + 






1 








n-4r 



F„,(x) 



n-*'"0 



V 



1 



G„(F„(x)) 



„-4'-0 



The second term in this last display converges to in probability easily since 
Fn{x)/x > 1 implies that it is bounded by 



1 



njFnjx)) 
Fn{x) 



< 



1 



Gn{t) 
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by Theorem of Wehner [42]. On the other hand, 

X Fnjx) X 

GnjFnjx)) _ \ Fnjx) ^ f Fnjx) _ ^ 
Fnjx) ) X \ X 

SO again by Theorem of Wellner [42] , the first term of (27) converges to 
in probability if 



/ Fnjx) 

limsup ||F„(x)/x||^_4r < oo and sup i 1 



n-4'-<x<n-'*''0 V ^ 



0. 



But this holds by a straightforward analysis using the asymptotics of <1> ^ 
when r < (3/3. □ 
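The alternative distribution function $F_n$ appearing in the proof of Lemma 7.3 is easy to evaluate numerically. The sketch below uses hypothetical function names and takes $\varepsilon_n$ and $\mu_n$ as plain arguments; the calibration $\varepsilon_n = n^{-\beta}$, $\mu_n = \sqrt{2r\log n}$ used in `calibrated_cdf` is the Donoho and Jin parametrization and is an assumption here, since it is not restated in this section.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()   # standard normal: provides cdf (Phi) and inv_cdf (Phi^{-1})

def mixture_cdf(x, eps, mu):
    """F_n(x) = x + eps * ((1 - x) - Phi(Phi^{-1}(1 - x) - mu)) for 0 < x < 1."""
    z = _STD_NORMAL.inv_cdf(1.0 - x)
    return x + eps * ((1.0 - x) - _STD_NORMAL.cdf(z - mu))

def calibrated_cdf(x, n, beta, r):
    """F_n(x) with eps_n = n^(-beta) and mu_n = sqrt(2 r log n) (assumed calibration)."""
    return mixture_cdf(x, n ** (-beta), math.sqrt(2.0 * r * math.log(n)))
```

With $\mu_n = 0$ the function reduces to the uniform distribution function, and for $\mu_n > 0$ it lies above $x$, in line with the inequality $F_n(x) \ge x$ noted in the proof.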

Now we have the tools in place to prove our extension of the results of
Donoho and Jin [15].

Proof of Theorem 5.1. First consider $1 < s < 2$. As in Donoho and
Jin [15], we first consider the case $r < \beta/3$. Then $r < 1/4$ and we can choose
$0 < r_0 < r < 1/4$. From Lemma 7.3 the convergence (26) holds. Thus, by
part (ii) of Lemma 7.2, it follows that for $n^{-4r} \le x \le n^{-4r_0}$, we have

ni^+(F„(x),x) = i(i^^|l=|py(l + o,(l)), 

and hence, 

nS+js) > sup nKt j¥njx),x) > ^HC:%^jl + Opjl)). 

Thus, nS^js) separates Hq and h["'^ for s G (1, 2) and r < (3/3. 

Now suppose that r > (1 — y/1 — /3)^ (and still 1 < s < 2). Since jr + 
(3)/j2^/r) < 1, we can pick a constant q <1 such that 

^^Vv/?<V5<L 

2V^ 

As argued by Donoho and Jin [15], under h["'\ n¥njn~'^) ~ Binomial(77., 
L„n-[^+(v^-v^)']), where L^n-t'^+^v^-v^)"! > n""?; here L„ is a logarithmic 
term that does not contribute significantly to the argument. Hence, we have 
¥njn^'')/n^'' ^ 1, and thus, from part (ii) of Lemma 7.2 again, 

..+ .^ . n¥njn-i) f f ¥njn-'') Y-^ \ 

nA + (F„(7i "),« '?) = _^-_^|(^___j _s|(l + Op(l)). 
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Hence, we conclude that 

> nK+(F„(n-^),n-'^) = (^^^)''' " + 

so using /?+ {^/q— V^)^ < g < 1, we conclude that nS^{s) separates Hq and 

if^"'' in this range. 

Now consider $-1 \le s < 1$. For this range of $s$ the argument is exactly
the same as above, but now using parts (iv) and (v) of Lemma 7.2. (Note
that the conclusion of Lemma A.1 of Donoho and Jin [15] can be strengthened
considerably as follows: if $Z_n \sim \mathrm{Bin}(n,\pi_n)$ with $\pi_n \to 0$ and $n\pi_n \to \infty$,
then $Z_n \to_p \infty$; i.e., for any number $M > 0$, we have $P(Z_n > M) \to 1$. This
follows easily from Theorem of Wellner [42] since $|Z_n/(n\pi_n) - 1| \to_p 0$, so
$Z_n = (Z_n/(n\pi_n))\,n\pi_n \to_p 1 \cdot \infty = \infty$. This also follows easily from the
Paley-Zygmund inequality (see, e.g., Kallenberg [29], page 40): $P(Z_n > rE(Z_n)) \ge (1-r)^2 (EZ_n)^2/E[Z_n^2]$.) □
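The Paley-Zygmund bound quoted in the remark can be compared with exact binomial probabilities. The following is a minimal sketch with hypothetical function names; it uses the standard binomial moments $EZ_n = n\pi_n$ and $E[Z_n^2] = n\pi_n(1-\pi_n) + (n\pi_n)^2$.

```python
import math

def binom_cdf(n, p, k):
    """P(Bin(n, p) <= k), summed from log-probabilities for stability."""
    total = 0.0
    for j in range(0, min(k, n) + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(j + 1) - math.lgamma(n - j + 1)
                   + j * math.log(p) + (n - j) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)

def prob_exceeds(n, p, threshold):
    """P(Z > threshold) for Z ~ Bin(n, p)."""
    return 1.0 - binom_cdf(n, p, math.floor(threshold))

def paley_zygmund_bound(n, p, r):
    """(1 - r)^2 (E Z)^2 / E[Z^2] for Z ~ Bin(n, p) and 0 <= r <= 1."""
    mean = n * p
    second_moment = n * p * (1.0 - p) + mean**2
    return (1.0 - r)**2 * mean**2 / second_moment
```

As $n\pi_n$ grows, the ratio $(EZ_n)^2/E[Z_n^2]$ tends to 1, so the bound approaches $(1-r)^2$; letting $r$ decrease to 0 then gives $P(Z_n > M) \to 1$ for every fixed $M$.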
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